Methods and apparatus to generate electronic mobile measurement census data

ABSTRACT

An example apparatus includes at least one memory, instructions, and at least one processor to execute the instructions to generate electronic mobile measurement data based on network communications received from first client devices, select attributes associated with the electronic mobile measurement data to include in a model, generate the model based on the attributes and a first portion of the electronic mobile measurement data, determine a percentage of a second portion of the electronic mobile measurement data that the model correctly associates with corresponding first users of the first client devices, and when the percentage satisfies a threshold determine: (a) when a second user operating a second client device is a primary user, and (b) when the user operating the second client device is a third user, and associate demographic information of the second user with the electronic mobile measurement data to reduce a misattribution error.

RELATED APPLICATIONS

This patent arises from a continuation of U.S. patent application Ser. No. 16/268,173, filed Feb. 5, 2019, now U.S. Pat. No. ______, entitled “METHODS AND APPARATUS TO GENERATE ELECTRONIC MOBILE MEASUREMENT CENSUS DATA,” which is a continuation of U.S. patent application Ser. No. 15/953,177, filed Apr. 13, 2018, now U.S. Pat. No. 10,217,122, entitled “METHOD, MEDIUM, AND APPARATUS TO GENERATE ELECTRONIC MOBILE MEASUREMENT CENSUS DATA,” which is a continuation of U.S. patent application Ser. No. 14/569,474, filed Dec. 12, 2014, now U.S. Pat. No. 9,953,330, entitled “METHODS, APPARATUS AND COMPUTER READABLE MEDIA TO GENERATE ELECTRONIC MOBILE MEASUREMENT CENSUS DATA,” which claims the benefit of U.S. Provisional Patent Application No. 61/952,729, filed Mar. 13, 2014, entitled “METHODS AND APPARATUS TO MODEL ACTIVITY ASSIGNMENT.” Priority to U.S. patent application Ser. No. 16/268,173, U.S. patent application Ser. No. 15/953,177, U.S. patent application Ser. No. 14/569,474, and U.S. Provisional Patent Application No. 61/952,729 is hereby claimed. U.S. patent application Ser. No. 16/268,173, U.S. patent application Ser. No. 15/953,177, U.S. patent application Ser. No. 14/569,474, and U.S. Provisional Patent Application No. 61/952,729 are hereby incorporated herein by reference in their entireties.

FIELD OF THE DISCLOSURE

This disclosure relates generally to audience measurement and, more particularly, to generating electronic mobile measurement census data.

BACKGROUND

Traditionally, audience measurement entities determine audience engagement levels for media programming based on registered panel members. That is, an audience measurement entity enrolls people who consent to being monitored into a panel. The audience measurement entity then monitors those panel members to determine media (e.g., television programs or radio programs, movies, DVDs, advertisements, etc.) exposed to those panel members. In this manner, the audience measurement entity can determine exposure measures for different media based on the collected media measurement data.

Techniques for monitoring user access to Internet resources such as web pages, advertisements and/or other media have evolved significantly over the years. Some prior systems perform such monitoring primarily through server logs. In particular, entities serving media on the Internet can use such prior systems to log the number of requests received for their media at their server.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 depicts an example system to collect impressions of media presented on mobile devices and to collect user information from distributed database proprietors for associating with the collected impressions.

FIG. 2 is an example system to collect impressions of media presented at mobile devices and to correct the impression data for misattribution errors.

FIG. 3 illustrates an example table depicting attributes used to generate the example activity assignment model of FIG. 2.

FIG. 4 illustrates an example implementation of the example impression corrector of FIG. 2 to associate a member of a panelist household to a logged impression collected from a mobile device in the same panelist household.

FIG. 5 depicts an example system to determine correction factors to correct impression data for misattribution errors.

FIG. 6 is a flow diagram representative of example machine readable instructions that may be executed to implement the example impression corrector of FIGS. 2, 4, and 5 to associate a member of the panelist household to logged impressions from an electronic device.

FIG. 7 is a flow diagram representative of example machine readable instructions that may be executed to implement the example assignment modeler of FIG. 2 to generate the activity assignment model.

FIG. 8 is a block diagram of an example processor system structured to execute the example machine readable instructions represented by FIGS. 6 and/or 7 to implement the example impression corrector and/or assignment modeler of FIGS. 2 and/or 4.

DETAILED DESCRIPTION

Examples disclosed herein may be used to generate and use models to correct for misattribution errors in collected impressions reported by electronic devices. As used herein, an impression is an instance of a person's exposure to media (e.g., content, advertising, etc.). When an impression is logged to track an audience for particular media, the impression may be associated with demographics of the person corresponding to the impression. This is referred to as attributing demographic data to an impression, or attributing an impression to demographic data. In this manner, media exposures of audiences and/or media exposures across different demographic groups can be measured. However, misattribution errors in collected impressions can occur when incorrect demographic data is attributed to an impression by incorrectly assuming which person corresponds to a logged impression. Such misattribution errors can significantly decrease the accuracies of media measurements. To improve accuracies of impression data having misattribution errors, examples disclosed herein may be used to re-assign logged impressions to different people (and, thus, demographic data) identified as having a higher probability or likelihood of being the person corresponding to the logged impression. Examples disclosed herein perform such re-assigning of logged impressions to different demographic data by generating and using activity assignment models.

An audience measurement entity (AME) measures the size of audiences exposed to media to produce ratings. Ratings are used by advertisers and/or marketers to purchase advertising space and/or design advertising campaigns. Additionally, media producers and/or distributors use the ratings to determine how to set prices for advertising space and/or to make programming decisions. As a larger portion of audiences use portable devices (e.g., tablets, smartphones, etc.) to access media, advertisers and/or marketers are interested in accurately calculated ratings (e.g., mobile television ratings (MTVR), etc.) for media accessed on these devices.

To measure audiences on mobile devices, an AME may use instructions (e.g., Java, JavaScript, or any other computer language or script) embedded in media as described below in connection with FIG. 1 to collect information indicating when audience members are accessing media on a mobile device. Media to be traced are tagged with these instructions. When a device requests the media, both the media and the instructions are downloaded to the client. The instructions cause information about the media access to be sent from a mobile device to a monitoring entity (e.g., the AME). Examples of tagging media and tracing media through these instructions are disclosed in U.S. Pat. No. 6,108,637, issued Aug. 22, 2000, entitled “Content Display Monitor,” which is incorporated by reference in its entirety herein.

Additionally, the instructions cause one or more user and/or device identifiers (e.g., an international mobile equipment identity (IMEI), a mobile equipment identifier (MEID), a media access control (MAC) address, an app store identifier, an open source unique device identifier (OpenUDID), an open device identification number (ODIN), a login identifier, a username, an email address, user agent data, third-party service identifiers, web storage data, document object model (DOM) storage data, local shared objects, an automobile vehicle identification number (VIN), etc.) located on a mobile device to be sent to a partnered database proprietor (e.g., Facebook, Twitter, Google, Yahoo!, MSN, Apple, Experian, etc.) to identify demographic information (e.g., age, gender, geographic location, race, income level, education level, religion, etc.) for the audience member of the mobile device collected via a user registration process. For example, an audience member may be viewing an episode of “The Walking Dead” in a media streaming app. In that instance, in response to instructions executing within the app, a user/device identifier stored on the mobile device is sent to the AME and/or a partner database proprietor to associate the instance of media exposure (e.g., an impression) to corresponding demographic data of the audience member. The database proprietor can then send logged demographic impression data to the AME for use by the AME in generating, for example, media ratings and/or other audience measures. In some examples, the partner database proprietor does not provide individualized demographic data (e.g., user-level demographics) in association with logged impressions. Instead, in some examples, the partnered database proprietor provides aggregate demographic impression data (sometimes referred to herein as “aggregate census data”). For example, the aggregate demographic impression data provided by the partner database proprietor may state that a thousand males age 17-34 watched the episode of “The Walking Dead” in the last seven days via mobile devices. However, the aggregate demographic data from the partner database proprietor does not identify individual persons (e.g., is not user-level data) associated with individual impressions. In this manner, the database proprietor protects the privacies of its subscribers/users by not revealing their identities and, thus, user-level media access activities, to the AME.

The AME uses this aggregate census data to calculate ratings and/or other audience measures for corresponding media. However, because mobile devices can be shared, misattribution can occur within the aggregate census data. Misattribution occurs when an impression corresponding to an individual in a first demographic group is attributed to an individual in a second demographic group. For example, initially, a first person in a household uses the mobile device to access a web site associated with a database proprietor (e.g., via a web browser of the mobile device, via an app installed on the mobile device, etc.), and the database proprietor may recognize the first person as being associated with the mobile device based on the access (e.g., a login event and/or other user-identifying event) by the first person. Subsequently, the first person stops using the device but does not log out of the database proprietor system on the device (or does not otherwise notify the database proprietor system and/or device that he/she is no longer using the device) and/or the second person does not log in to the database proprietor system (or perform any other user-identifying activity) to allow the database proprietor to recognize the second person as a different user than the first person. Consequently, when the second person begins using the same mobile device to access media, the database proprietor continues to (in this case, incorrectly) recognize media accesses of the mobile device (e.g., media impressions) as being associated with the first person. Therefore, impressions that should be attributed to the second person and the second demographic group are incorrectly attributed (e.g., misattributed) to the first person and the first demographic group. For example, a 17-year old male household member may use a mobile device of a 42-year old female to watch “The Walking Dead.” In such an example, if the 42-year old female is not logged out of a user-identifying service, tool, or app (e.g., a social networking service, tool, or app or any other user-identifying service, tool, or app on the mobile device), the impression that occurs when the 17-year old accesses “The Walking Dead” media will be misattributed to the 42-year old female. The effect of large-scale misattribution error may create measurement bias error by incorrectly representing the demographic distribution of media impressions across a large audience and, therefore, misrepresenting the audience demographics of impressions collected for advertisements and/or other media to which exposure is monitored by the AME.

Misattribution error also occurs when a mobile device is generally associated with use by a particular household member, but occasionally used by another person. In such examples, one or more user/device identifiers (e.g., an international mobile equipment identity (IMEI), a mobile equipment identifier (MEID), a media access control (MAC) address, an app store identifier, an open source unique device identifier (OpenUDID), an open device identification number (ODIN), a login identifier, a username, an email address, user agent data, third-party service identifiers, web storage data, document object model (DOM) storage data, local shared objects, an automobile vehicle identification number (VIN), etc.) on the device is/are associated at a database proprietor with the particular household member as described below in connection with FIG. 1. As such, when the particular household member uses the mobile device to access media, and the media accesses are reported to a database proprietor along with one or more user/device identifier(s), the database proprietor logs impressions of the media accesses in association with the demographic information identified based on the user/device identifier(s). However, on the occasion when the mobile device is shared and used by a second household member, media accesses during such time that are reported to the database proprietor along with the user/device identifier(s) of the mobile device are incorrectly attributed (e.g., misattributed) to the particular household member that is associated with the mobile device rather than being attributed to the second household member.

To correct impression data for misattribution errors, the AME uses responses to a survey conducted on randomly selected people and/or households to calculate correction factors. Such a survey is sometimes referred to herein as a probability survey. Survey responses include information about demographics of each member of the household, types of devices in the household, which members of the household use which devices, media viewing preferences, which members of the household are registered with which database proprietors, etc. The AME calculates the correction factors based on responses to the probability survey. The correction factors represent how often the impressions of one demographic group are misattributed to another group. For example, a misattribution factor may state that, in a household with a male, age 17-24 and a female, age 35-46, 1.56% of the exposure data attributed to the female, age 35-46 should be attributed to the male, age 17-24. In some examples, the correction factors are calculated for different characteristics of users, devices and/or media (e.g., age, gender, device type, media genre, etc.). For example, the misattribution error between a first demographic group and a second demographic group may be different on a tablet as compared to a smartphone. In such instances, the correction factors calculated for media accessed on a tablet would be different than the correction factors calculated for media accessed on a smartphone.
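As a hedged illustration of how a correction factor of this kind might be applied, the following sketch moves a fraction of the impressions attributed to one demographic group to another group. The function name, the group labels, and the impression counts are hypothetical and chosen only for illustration; only the 1.56% factor is taken from the example above.

```python
def apply_correction(counts, from_group, to_group, factor):
    """Return a copy of counts with `factor` of from_group's impressions moved to to_group."""
    moved = counts[from_group] * factor
    corrected = dict(counts)  # leave the input untouched
    corrected[from_group] -= moved
    corrected[to_group] = corrected.get(to_group, 0.0) + moved
    return corrected


# Hypothetical aggregate impression counts per demographic group.
counts = {"F35-46": 10000.0, "M17-24": 2000.0}

# 1.56% of the impressions attributed to the female, age 35-46 are
# re-attributed to the male, age 17-24.
corrected = apply_correction(counts, "F35-46", "M17-24", 0.0156)
```

The total number of impressions is unchanged by the correction; only the distribution across demographic groups shifts.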

In some examples, the probability survey responses do not provide detailed information about which member of a household was exposed to which media category (e.g., comedy, drama, reality, etc.). For example, if during a survey of a household, a male, age 54-62, a female, age 62-80 and a female, age 18-34 indicated they watch drama programming on a tablet device, the AME assumes that each of those members of the household produces one-third of the impression data associated with accessing drama programming on a monitored tablet device of the household. However, the media exposure habits of the members of the household may be different. For example, the male, age 54-62 may only access 10% of the drama programming on the tablet, while the female, age 62-80 accesses 50% of the drama programming and the female, age 18-34 accesses 40% of the drama programming.
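The gap between the equal-share assumption and actual viewing habits can be made concrete with a short sketch. The member labels and the helper name are hypothetical; the shares are those of the example above.

```python
def equal_shares(members):
    """Assume every listed household member produces the same share of impressions."""
    share = 1.0 / len(members)
    return {member: share for member in members}


members = ["M54-62", "F62-80", "F18-34"]

# Shares assumed from the probability survey (one-third each).
assumed = equal_shares(members)

# Shares from the example above (10%, 50%, 40%).
actual = {"M54-62": 0.10, "F62-80": 0.50, "F18-34": 0.40}

# Positive values mean the member is over-credited by the equal-share assumption.
error = {member: assumed[member] - actual[member] for member in members}
```

In this example the male, age 54-62 is over-credited by roughly 23 percentage points, while both female members are under-credited.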

As disclosed below, to increase accuracies of misattribution correction factors that are for use in correcting for misattribution errors in aggregate demographic impression data generated by database proprietors, the AME may use census data generated from demographic impressions of panelists recruited to participate in an AME panel (sometimes referred to herein as “electronic mobile measurement (EMM) panelists”) on mobile devices. Demographic impression data collected through EMM panelists is highly accurate because the AME collects highly accurate demographic information from the EMM panelists and the EMM panelists consent to detailed monitoring of their accesses to media on mobile devices.

As used herein, a demographic impression is defined to be an impression that is associated with a characteristic (e.g., a demographic characteristic) of a person exposed to media. EMM panelist census data (sometimes referred to as “EMM census data” or “EMMC data”) is defined herein to be demographic impression data that includes impression data (e.g., data representative of an impression, such as program identifier (ID), channel and/or application ID, time, date, etc.) of the EMM panelists combined with the corresponding demographic information of the EMM panelists. In some examples, EMM panelists may be identified by using user/device identifiers on the mobile device that are collected by instructions or data collectors in apps used to access media. Alternatively or additionally, EMM panelists may be identified using AME and/or partnered database proprietor cookies set on the mobile device via, for example, a web browser. For example, in response to instructions executing in a television viewing app, a media access app, or an Internet web browser, the mobile device may send impression data and a user/device identifier (e.g., EMM panelist ID, database proprietor ID, etc.) and/or cookie to the AME, a database proprietor, and/or any other entity that collects such information.

However, in households with multiple people, more than one person may share an EMM panelist's mobile device to access media without providing an indication of which member of the household is using the device. As such, impressions reported by the shared mobile device are misattributed to the wrong household member (e.g., misattributed to the EMM panelist regarded as being associated with the mobile device). For example, a 10-year old female household member may be using a 28-year old male EMM panelist's mobile device. In such an example, the impression data generated while the 10-year old female was using the mobile device would be misattributed to the 28-year old male EMM panelist. Such misattributions reduce the accuracy of the EMM census data.

As disclosed below, the AME generates an activity assignment model (AAM) using historical exposure data to correct for misattribution errors in EMM census data. In disclosed examples, the AAM determines the probability (sometimes referred to herein as “a probability score”) that a person with certain characteristics (e.g., age, gender, ethnicity, household size, etc.) would access a television program with certain characteristics (e.g., genre, etc.) at a certain time (e.g., day of the week, daypart, etc.). In disclosed examples, the AME also collects demographic information of other members of the panelist's household and information regarding their usage of mobile devices in the household (sometimes referred to herein as “supplemental survey data”). For example, the usage information may include types of mobile devices used in the household, the primary users of the mobile devices, and/or whether the EMM panelist's mobile device(s) is/are shared with other members of the household.

As disclosed below, using the AAM and the supplemental survey data, the AME corrects for misattribution errors in the EMM census data. In some examples disclosed herein, the AME assumes that the EMM panelist accessed the program on a mobile device that generated an impression request (e.g., a request to log an impression at the AME). When such an assumption is made, the AME determines if the presumption should and/or can be overcome. In some examples, the presumption is overcome if a probability score calculated (e.g., using the AAM) for the EMM panelist does not satisfy (e.g., is less than) a calibrated threshold. If the presumption is overcome, the AME assigns the logged impression to a different member of the EMM panelist's household. By processing numerous logged impressions from across numerous households in this manner, the AME generates AAM-adjusted EMM census data using examples disclosed herein.

As disclosed below, misattribution correction factors generated by the AME are calibrated using the AAM-adjusted EMM census data. The AAM-adjusted EMM census data is used to determine household sharing patterns indicative of members of a household who accessed a media category (e.g., media categorized by genre, etc.) on the mobile device and what percentage of audience activity is attributable to which household member. For example, in a household that shares a tablet to access (e.g., view, listen to, etc.) media, the AAM-adjusted EMM census data may indicate that a male, age 18-34 accesses 45% of the comedy media presented on the tablet, a female, age 18-43 accesses 20% of the comedy media presented on the tablet, a male, age 2-12 accesses 25% of the comedy media presented on the tablet, and a female, age 13-17 accesses 0% of the comedy media presented on the tablet. The AME uses the household sharing patterns combined with information included in the supplemental survey data (e.g., the database proprietor accounts of each household member, devices used to access the database proprietor accounts by each household member, etc.) to calibrate the misattribution correction factors produced using the probability survey. In some examples, the AAM-adjusted EMM census data may be used to generate the misattribution correction factors in place of the probability survey.
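One way a household sharing pattern could be derived from adjusted impression records is sketched below. This is an illustrative assumption, not the disclosed implementation: the record layout (member label, genre), the function name, and the sample data are all hypothetical.

```python
from collections import Counter


def sharing_pattern(impressions, genre):
    """Return each member's share of a household's impressions for one genre.

    `impressions` is a list of (member, genre) tuples, one per adjusted impression.
    """
    counts = Counter(member for member, g in impressions if g == genre)
    total = sum(counts.values())
    if total == 0:
        return {}  # no impressions of this genre were logged
    return {member: n / total for member, n in counts.items()}


# Hypothetical adjusted impressions logged from one shared tablet.
impressions = [
    ("member_a", "comedy"),
    ("member_a", "comedy"),
    ("member_a", "comedy"),
    ("member_b", "comedy"),
    ("member_b", "drama"),
]

comedy_pattern = sharing_pattern(impressions, "comedy")
```

Here member_a accounts for three of the four comedy impressions and member_b for the remaining one; the drama impression is ignored when the comedy pattern is computed.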

In some examples, the AME contracts and/or enlists panelists using any desired methodology (e.g., random selection, statistical selection, phone solicitations, Internet advertisements, surveys, advertisements in shopping malls, product packaging, etc.). Demographic information (e.g., gender, occupation, salary, race and/or ethnicity, marital status, highest completed education, current employment status, etc.) is obtained from a panelist when the panelist joins (e.g., registers for) one or more panels (e.g., the EMM panel). For example, EMM panelists agree to allow the AME to monitor their media accesses on mobile devices (e.g., television programming accessed through a browser or an app, etc.). In some examples, to facilitate monitoring media accesses, the AME provides a metering app (e.g., an app used to associate the mobile device with the panelist) to the panelist after the panelist enrolls in the EMM panel.

Disclosed example methods of generating electronic media measurement census data involve logging an impression based on a communication received from a client device, the logged impression corresponding to media accessed at the client device. The example methods further involve, when a panelist associated with the client device is determined to be an audience member of the media on the client device, associating demographic information of the panelist with the logged impression associated with the media. The example methods further involve, when the panelist associated with the client device is determined not to be the audience member of the media at the client device, determining probability scores for respective household members residing in a household with the panelist, the probability scores indicative of probabilities that corresponding ones of the household members are the audience member of the media at the client device, and associating demographic information of one of the household members that has a highest probability score with the logged impression associated with the media.
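The attribution step described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the dictionary layout, and the sample scores are hypothetical, and the determination of whether the panelist is the audience member is taken as a given input rather than computed here.

```python
def attribute_impression(panelist, panelist_is_audience, scores, demographics):
    """Return the demographic record to associate with a logged impression.

    `scores` maps the other household members to their probability scores.
    """
    if panelist_is_audience:
        return demographics[panelist]
    # Otherwise select the household member with the highest probability score.
    best_member = max(scores, key=scores.get)
    return demographics[best_member]


# Hypothetical household and probability scores.
demographics = {
    "panelist": {"age": 28, "gender": "M"},
    "member_a": {"age": 10, "gender": "F"},
    "member_b": {"age": 31, "gender": "F"},
}
scores = {"member_a": 0.62, "member_b": 0.21}

result = attribute_impression("panelist", False, scores, demographics)
```

With these sample scores, the impression is associated with the demographic information of member_a rather than that of the panelist.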

In some example methods, determining whether the panelist associated with the client device is the audience member of the media presented on the client device further comprises determining that the panelist associated with the client device is the audience member of the media at the mobile device if a size of the household equals one.

In some example methods, determining whether the panelist associated with the client device is the audience member of the media presented on the client device further comprises determining that the panelist associated with the client device is the audience member of the media at the client device if the panelist has indicated that the panelist does not share the client device.

In some example methods, determining whether the panelist associated with the client device is the audience member of the media presented on the client device further comprises determining that the panelist associated with the client device is the audience member of the media at the client device if a probability score calculated for the panelist satisfies a threshold. In some example methods, the threshold is a calibration factor divided by the size of the household. In some such example methods, the calibration factor is based on demographic information of the panelist and a type of the client device. In some such example methods, the demographic information of the panelist and the type of the client device correspond to a first demographic group, and the calibration factor is a ratio of an average time that the first demographic group accessed the media and an average time that all demographic groups accessed the media.
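The three determinations above (household of one, unshared device, and probability score against a threshold) can be combined in a short sketch. The function names and sample values are hypothetical, and treating "satisfies" as greater than or equal to the threshold is an assumption made for illustration.

```python
def calibration_factor(avg_time_group, avg_time_all):
    """Ratio of the first demographic group's average access time to the all-group average."""
    return avg_time_group / avg_time_all


def threshold(calibration, household_size):
    """Calibration factor divided by the size of the household."""
    return calibration / household_size


def panelist_is_audience(score, household_size, shares_device, calibration):
    """Decide whether the panelist is treated as the audience member of the media."""
    if household_size == 1:
        return True  # nobody else in the household could have accessed the media
    if not shares_device:
        return True  # panelist indicated that the device is not shared
    # Assumption: "satisfies" means the score is at least the threshold.
    return score >= threshold(calibration, household_size)


# Hypothetical average access times (in minutes) and a three-person household.
factor = calibration_factor(avg_time_group=30.0, avg_time_all=40.0)  # 0.75
limit = threshold(factor, household_size=3)                          # 0.25

kept = panelist_is_audience(0.30, 3, True, factor)
reassigned = not panelist_is_audience(0.20, 3, True, factor)
```

Dividing by household size lowers the threshold as the number of potential users grows, so a panelist in a larger household keeps the impression with a smaller probability score.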

In some example methods, the required processing resources on the client device are reduced by not requiring the user of the client device to self-identify.

Disclosed example apparatus include an impression server to log an impression based on a communication received from a client device, the logged impression corresponding to media accessed at the client device. The example apparatus further includes a probability calculator to, when a panelist associated with the client device is determined not to be the person who accessed the media at the client device, determine probability scores for respective household members residing in a household with the panelist, the probability scores indicative of probabilities that corresponding ones of the household members are the person who accessed the media at the client device. The example apparatus further includes a processor to, when a panelist associated with the client device is determined to be the person who accessed the media at the client device, associate demographic information of the panelist with the logged impression associated with the media, and when the panelist associated with the client device is determined to not be the person who accessed the media at the client device, associate demographic information of one of the household members that has a highest probability score with the logged impression associated with the media.

In some example apparatus, to determine whether the panelist associated with the client device is the person who accessed the media at the client device, the probability calculator is further to determine that the panelist associated with the client device is the person who accessed the media at the client device if a size of the household equals one.

In some example apparatus, to determine whether the panelist associated with the client device is the person who accessed the media at the client device, the probability calculator is further to determine that the panelist associated with the client device is the person who accessed the media at the client device if the panelist has indicated that the panelist does not share the client device.

In some example apparatus, to determine whether the panelist associated with the client device is the person who accessed the media at the client device, the probability calculator is further to determine if a probability score calculated for the panelist satisfies a threshold. In some such apparatus, the threshold is a calibration factor divided by the size of the household. In some such apparatus, the calibration factor is based on demographic information of the panelist and a type of the client device. In some such apparatus, the demographic information of the panelist and the type of the client device define a first demographic group, and the calibration factor is a ratio of an average time that the first demographic group accessed the media and an average time that all demographic groups accessed the media.

FIG. 1 depicts an example system 100 to collect user information (e.g., user information 102 a, 102 b) from distributed database proprietors 104 a, 104 b for associating with impressions of media presented at a client device 106. In the illustrated examples, user information 102 a, 102 b or user data includes one or more of demographic data, purchase data, and/or other data indicative of user activities, behaviors, and/or preferences related to information accessed via the Internet, purchases, media accessed on electronic devices, physical locations (e.g., retail or commercial establishments, restaurants, venues, etc.) visited by users, etc. Examples disclosed herein are described in connection with a mobile device, which may be a mobile phone, a mobile communication device, a tablet, a gaming device, a portable media presentation device, an in-vehicle or vehicle-integrated communication system, such as an automobile infotainment system with wireless communication capabilities, etc. However, examples disclosed herein may be implemented in connection with non-mobile devices such as internet appliances, smart televisions, internet terminals, computers, or any other device capable of presenting media received via network communications.

In the illustrated example of FIG. 1, to track media impressions on the client device 106, an audience measurement entity (AME) 108 partners with or cooperates with an app publisher 110 to download and install a data collector 112 on the client device 106. The app publisher 110 of the illustrated example may be a software app developer that develops and distributes apps to mobile devices and/or a distributor that receives apps from software app developers and distributes the apps to mobile devices. The data collector 112 may be included in other software loaded onto the client device 106, such as the operating system 114, an application (or app) 116, a web browser 117, and/or any other software. In some examples, the example client device 106 of FIG. 1 is a non-locally metered device. For example, the client device 106 of a non-panelist household does not support and/or has not been provided with specific metering software (e.g., dedicated metering software provided directly by the AME 108 and executing as a foreground or background process for the sole purpose of monitoring media accesses/exposure).

Any of the example software 114-117 may present media 118 received from a media publisher 120. The media 118 may be an advertisement, video, audio, text, a graphic, a web page, news, educational media, entertainment media, or any other type of media. In the illustrated example, a media ID 122 is provided in the media 118 to enable identifying the media 118 so that the AME 108 can credit the media 118 with media impressions when the media 118 is presented on the client device 106 or any other device that is monitored by the AME 108.

The data collector 112 of the illustrated example includes instructions (e.g., Java, JavaScript, or any other computer language or script) that, when executed by the client device 106, cause the client device 106 to collect the media ID 122 of the media 118 presented by the app program 116 and/or the client device 106, and to collect one or more device/user identifier(s) 124 stored in the client device 106. The device/user identifier(s) 124 of the illustrated example include identifiers that can be used by corresponding ones of the partner database proprietors 104 a-b to identify the user or users of the client device 106, and to locate user information 102 a-b corresponding to the user(s). For example, the device/user identifier(s) 124 may include hardware identifiers (e.g., an international mobile equipment identity (IMEI), a mobile equipment identifier (MEID), a media access control (MAC) address, etc.), an app store identifier (e.g., a Google Android ID, an Apple ID, an Amazon ID, etc.), an open source unique device identifier (OpenUDID), an open device identification number (ODIN), a login identifier (e.g., a username), an email address, user agent data (e.g., application type, operating system, software vendor, software revision, etc.), third-party service identifiers (e.g., advertising service identifiers, device usage analytics service identifiers, demographics collection service identifiers), web storage data, document object model (DOM) storage data, local shared objects (also referred to as “Flash cookies”), an automobile vehicle identification number (VIN), etc. In some examples, fewer or more device/user identifier(s) 124 may be used. In addition, although only two partner database proprietors 104 a-b are shown in FIG. 1, the AME 108 may partner with any number of partner database proprietors to collect distributed user information (e.g., the user information 102 a-b).

In some examples, the client device 106 may not allow access to identification information stored in the client device 106. For such instances, the disclosed examples enable the AME 108 to store an AME-provided identifier (e.g., an identifier managed and tracked by the AME 108) in the client device 106 to track media impressions on the client device 106. For example, the AME 108 may provide instructions in the data collector 112 to set an AME-provided identifier in memory space accessible by and/or allocated to the app program 116. The data collector 112 uses the identifier as a device/user identifier 124. In such examples, the AME-provided identifier set by the data collector 112 persists in the memory space even when the app program 116 and the data collector 112 are not running. In this manner, the same AME-provided identifier can remain associated with the client device 106 for extended durations and from app to app. In some examples in which the data collector 112 sets an identifier in the client device 106, the AME 108 may recruit a user of the client device 106 as a panelist, and may store user information collected from the user during a panelist registration process and/or collected by monitoring user activities/behavior via the client device 106 and/or any other device used by the user and monitored by the AME 108. In this manner, the AME 108 can associate user information of the user (from panelist data stored by the AME 108) with media impressions attributed to the user on the client device 106.

In the illustrated example, the data collector 112 sends the media ID 122 and the one or more device/user identifier(s) 124 as collected data 126 to the app publisher 110. Alternatively, the data collector 112 may be configured to send the collected data 126 to another collection entity (other than the app publisher 110) that has been contracted by the AME 108 or is partnered with the AME 108 to collect media IDs (e.g., the media ID 122) and device/user identifiers (e.g., the device/user identifier(s) 124) from mobile devices (e.g., the client device 106). In the illustrated example, the app publisher 110 (or a collection entity) sends the media ID 122 and the device/user identifier(s) 124 as impression data 130 to a server 132 at the AME 108. The impression data 130 of the illustrated example may include one media ID 122 and one or more device/user identifier(s) 124 to report a single impression of the media 118, or it may include numerous media IDs 122 and device/user identifier(s) 124 based on numerous instances of collected data (e.g., the collected data 126) received from the client device 106 and/or other mobile devices to report multiple impressions of media.

In the illustrated example, the server 132 stores the impression data 130 in an AME media impressions store 134 (e.g., a database or other data structure). Subsequently, the AME 108 sends the device/user identifier(s) 124 to corresponding partner database proprietors (e.g., the partner database proprietors 104 a-b) to receive user information (e.g., the user information 102 a-b) corresponding to the device/user identifier(s) 124 from the partner database proprietors 104 a-b so that the AME 108 can associate the user information with corresponding media impressions of media (e.g., the media 118) presented at mobile devices (e.g., the client device 106).

In some examples, to protect the privacy of the user of the client device 106, the media identifier 122 and/or the device/user identifier(s) 124 are encrypted before they are sent to the AME 108 and/or to the partner database proprietors 104 a-b. In other examples, the media identifier 122 and/or the device/user identifier(s) 124 are not encrypted.

After the AME 108 receives the device/user identifier(s) 124, the AME 108 sends device/user identifier logs 136 a-b to corresponding partner database proprietors (e.g., the partner database proprietors 104 a-b). In some examples, each of the device/user identifier logs 136 a-b includes a single device/user identifier. In some examples, some or all of the device/user identifier logs 136 a-b include numerous aggregate device/user identifiers received at the AME 108 over time from one or more mobile devices. After receiving the device/user identifier logs 136 a-b, each of the partner database proprietors 104 a-b looks up its users corresponding to the device/user identifiers 124 in the respective logs 136 a-b. In this manner, each of the partner database proprietors 104 a-b collects user information 102 a-b corresponding to users identified in the device/user identifier logs 136 a-b for sending to the AME 108. For example, if the partner database proprietor 104 a is a wireless service provider and the device/user identifier log 136 a includes IMEI numbers recognizable by the wireless service provider, the wireless service provider accesses its subscriber records to find users having IMEI numbers matching the IMEI numbers received in the device/user identifier log 136 a. When the users are identified, the wireless service provider copies the users' user information to the user information 102 a for delivery to the AME 108.
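The IMEI lookup performed by the wireless service provider in the example above might be sketched as follows. This is a minimal illustration only; the `Subscriber` record layout, the function name `collect_user_information`, and the sample IMEI values are hypothetical and are not specified by this disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subscriber:
    """Hypothetical subscriber record held by a database proprietor."""
    imei: str
    age: int
    gender: str


def collect_user_information(identifier_log, subscriber_records):
    """Return user information for each logged IMEI that matches a subscriber."""
    # Index the subscriber records by IMEI so each lookup is a dictionary hit.
    by_imei = {s.imei: s for s in subscriber_records}
    return [
        {"imei": imei, "age": by_imei[imei].age, "gender": by_imei[imei].gender}
        for imei in identifier_log
        if imei in by_imei  # identifiers the proprietor does not recognize are skipped
    ]


# Illustrative data only.
records = [
    Subscriber("356938035643809", 34, "F"),
    Subscriber("490154203237518", 52, "M"),
]
log_136a = ["490154203237518", "000000000000000"]
user_information_102a = collect_user_information(log_136a, records)
```

In this sketch, identifiers with no matching subscriber record simply produce no user information, which is one plausible handling among several.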

In some other examples, the example data collector 112 sends the device/user identifier(s) 124 from the client device 106 to the app publisher 110 in the collected data 126, and it also sends the device/user identifier(s) 124 to the media publisher 120. In such other examples, the data collector 112 does not collect the media ID 122 from the media 118 at the client device 106 as the data collector 112 does in the example system 100 of FIG. 1. Instead, the media publisher 120 that publishes the media 118 to the client device 106 retrieves the media ID 122 from the media 118 that it publishes. The media publisher 120 then associates the media ID 122 to the device/user identifier(s) 124 received from the data collector 112 executing in the client device 106, and sends collected data 138 to the app publisher 110 that includes the media ID 122 and the associated device/user identifier(s) 124 of the client device 106. For example, when the media publisher 120 sends the media 118 to the client device 106, it does so by identifying the client device 106 as a destination device for the media 118 using one or more of the device/user identifier(s) 124 received from the client device 106. In this manner, the media publisher 120 can associate the media ID 122 of the media 118 with the device/user identifier(s) 124 of the client device 106 indicating that the media 118 was sent to the particular client device 106 for presentation (e.g., to generate an impression of the media 118).

Alternatively, in some other examples in which the data collector 112 is configured to send the device/user identifier(s) 124 to the media publisher 120, and the data collector 112 does not collect the media ID 122 from the media 118 at the client device 106, the media publisher 120 sends impression data 130 to the AME 108. For example, the media publisher 120 that publishes the media 118 to the client device 106 also retrieves the media ID 122 from the media 118 that it publishes, and associates the media ID 122 with the device/user identifier(s) 124 of the client device 106. The media publisher 120 then sends the media impression data 130, including the media ID 122 and the device/user identifier(s) 124, to the AME 108. For example, when the media publisher 120 sends the media 118 to the client device 106, it does so by identifying the client device 106 as a destination device for the media 118 using one or more of the device/user identifier(s) 124. In this manner, the media publisher 120 can associate the media ID 122 of the media 118 with the device/user identifier(s) 124 of the client device 106 indicating that the media 118 was sent to the particular client device 106 for presentation (e.g., to generate an impression of the media 118). In the illustrated example, after the AME 108 receives the impression data 130 from the media publisher 120, the AME 108 can then send the device/user identifier logs 136 a-b to the partner database proprietors 104 a-b to request the user information 102 a-b as described above.

Although the media publisher 120 is shown separate from the app publisher 110 in FIG. 1, the app publisher 110 may implement at least some of the operations of the media publisher 120 to send the media 118 to the client device 106 for presentation. For example, advertisement providers, media providers, or other information providers may send media (e.g., the media 118) to the app publisher 110 for publishing to the client device 106 via, for example, the app program 116 when it is executing on the client device 106. In such examples, the app publisher 110 implements the operations described above as being performed by the media publisher 120.

Additionally or alternatively, in contrast with the examples described above in which the client device 106 sends identifiers to the audience measurement entity 108 (e.g., via the application publisher 110, the media publisher 120, and/or another entity), in other examples the client device 106 (e.g., the data collector 112 installed on the client device 106) sends the identifiers (e.g., the user/device identifier(s) 124) directly to the respective database proprietors 104 a, 104 b (e.g., not via the AME 108). In such examples, the example client device 106 sends the media identifier 122 to the audience measurement entity 108 (e.g., directly or through an intermediary such as via the application publisher 110), but does not send the media identifier 122 to the database proprietors 104 a-b.

As mentioned above, the example partner database proprietors 104 a-b provide the user information 102 a-b to the example AME 108 for matching with the media identifier 122 to form media impression information. As also mentioned above, the database proprietors 104 a-b are not provided copies of the media identifier 122. Instead, the client device 106 provides the database proprietors 104 a-b with impression identifiers 140. An impression identifier 140 uniquely identifies an impression event relative to other impression events of the client device 106 so that an occurrence of an impression at the client device 106 can be distinguished from other occurrences of impressions. However, the impression identifier 140 does not itself identify the media associated with that impression event. In such examples, the impression data 130 from the client device 106 to the AME 108 also includes the impression identifier 140 and the corresponding media identifier 122. To match the user information 102 a-b with the media identifier 122, the example partner database proprietors 104 a-b provide the user information 102 a-b to the AME 108 in association with the impression identifier 140 for the impression event that triggered the collection of the user information 102 a-b. In this manner, the AME 108 can match the impression identifier 140 received from the client device 106 via the impression data 130 to a corresponding impression identifier 140 received from the partner database proprietors 104 a-b via the user information 102 a-b to associate the media identifier 122 received from the client device 106 with demographic information in the user information 102 a-b received from the database proprietors 104 a-b.

The impression identifier 140 of the illustrated example is structured to reduce or avoid duplication of audience member counts for audience size measures. For example, the example partner database proprietors 104 a-b provide the user information 102 a-b and the impression identifier 140 to the AME 108 on a per-impression basis (e.g., each time a client device 106 sends a request including an encrypted identifier 208 a-b and an impression identifier 140 to the partner database proprietor 104 a-b) and/or on an aggregated basis. When aggregate impression data is provided in the user information 102 a-b, the user information 102 a-b includes indications of multiple impressions (e.g., multiple impression identifiers 140) at mobile devices. In some examples, aggregate impression data includes unique audience values (e.g., a measure of the quantity of unique audience members exposed to particular media), total impression count, frequency of impressions, etc. In some examples, the individual logged impressions are not discernable from the aggregate impression data.

As such, it is not readily discernable from the user information 102 a-b whether instances of individual user-level impressions logged at the database proprietors 104 a, 104 b correspond to the same audience member such that unique audience sizes indicated in the aggregate impression data of the user information 102 a-b are inaccurate for being based on duplicate counting of audience members. However, the impression identifier 140 provided to the AME 108 enables the AME 108 to distinguish unique impressions and avoid overcounting a number of unique users and/or devices accessing the media. For example, the relationship between the user information 102 a from the partner A database proprietor 104 a and the user information 102 b from the partner B database proprietor 104 b for the client device 106 is not readily apparent to the AME 108. By including an impression identifier 140 (or any similar identifier), the example AME 108 can associate user information corresponding to the same user between the user information 102 a-b based on matching impression identifiers 140 stored in both of the user information 102 a-b. The example AME 108 can use such matching impression identifiers 140 across the user information 102 a-b to avoid overcounting mobile devices and/or users (e.g., by only counting unique users instead of counting the same user multiple times).

A same user may be counted multiple times if, for example, an impression causes the client device 106 to send multiple user/device identifiers to multiple different database proprietors 104 a-b without an impression identifier (e.g., the impression identifier 140). For example, a first one of the database proprietors 104 a sends first user information 102 a to the AME 108, which signals that an impression occurred. In addition, a second one of the database proprietors 104 b sends second user information 102 b to the AME 108, which signals (separately) that an impression occurred. In addition, separately, the client device 106 sends an indication of an impression to the AME 108. Without knowing that the user information 102 a-b is from the same impression, the AME 108 has an indication from the client device 106 of a single impression and indications from the database proprietors 104 a-b of multiple impressions.

To avoid overcounting impressions, the AME 108 can use the impression identifier 140. For example, after looking up user information 102 a-b, the example partner database proprietors 104 a-b transmit the impression identifier 140 to the AME 108 with corresponding user information 102 a-b. The AME 108 matches the impression identifier 140 obtained directly from the client device 106 to the impression identifier 140 received from the database proprietors 104 a-b with the user information 102 a-b to thereby associate the user information 102 a-b with the media identifier 122 and to generate impression information. This is possible because the AME 108 received the media identifier 122 in association with the impression identifier 140 directly from the client device 106. Therefore, the AME 108 can map user data from two or more database proprietors 104 a-b to the same media exposure event, thus avoiding double counting.
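The matching of impression identifiers described above can be illustrated with a short sketch. The tuple layouts, the function name `build_demographic_impressions`, and the sample values are assumptions made for illustration; the disclosure does not prescribe a particular data format.

```python
def build_demographic_impressions(client_impressions, proprietor_reports):
    """Join proprietor user information to media IDs via the impression identifier.

    client_impressions: iterable of (impression_id, media_id) pairs reported
        by the client device (hypothetical layout).
    proprietor_reports: iterable of (impression_id, proprietor, user_info)
        tuples reported by database proprietors (hypothetical layout).
    """
    media_by_impression = dict(client_impressions)
    merged = {}
    for impression_id, proprietor, user_info in proprietor_reports:
        if impression_id not in media_by_impression:
            continue  # no client-side report to match against
        entry = merged.setdefault(
            impression_id,
            {"media_id": media_by_impression[impression_id], "user_info": {}},
        )
        # Reports from different proprietors land on the same impression entry,
        # so one exposure event is counted once.
        entry["user_info"][proprietor] = user_info
    return merged


# Illustrative data: two proprietors report on the same impression.
client = [("imp-1", "media-122")]
reports = [
    ("imp-1", "partner A", {"age_range": "25-34"}),
    ("imp-1", "partner B", {"gender": "F"}),
]
impressions = build_demographic_impressions(client, reports)
```

Here the two proprietor reports collapse into a single entry keyed by the shared impression identifier, rather than being treated as two separate impressions.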

Each unique impression identifier 140 in the illustrated example is associated with a specific impression of media on the client device 106. The partner database proprietors 104 a-b receive the respective user/device identifiers 124 and generate the user information 102 a-b independently (e.g., without regard to others of the partner database proprietors 104 a-b) and without knowledge of the media identifier 122 involved in the impression. Without an indication that a particular user demographic profile in the user information 102 a (received from the partner database proprietor 104 a) is associated with (e.g., the result of) the same impression at the client device 106 as a particular user demographic profile in the user information 102 b (received from the partner database proprietor 104 b independently of the user information 102 a received from the partner database proprietor 104 a), and without reference to the impression identifier 140, the AME 108 may not be able to associate the user information 102 a with the user information 102 b and/or cannot determine that the different pieces of user information 102 a-b are associated with a same impression and could, therefore, count the user information 102 a and the user information 102 b as corresponding to two different users/devices and/or two different impressions.

The above examples illustrate methods and apparatus for collecting impression data at an audience measurement entity (or other entity). The examples discussed above may be used to collect impression information for any type of media, including static media (e.g., advertising images), streaming media (e.g., streaming video and/or audio, including content, advertising, and/or other types of media), and/or other types of media. For static media (e.g., media that does not have a time component such as images, text, a webpage, etc.), the example AME 108 records an impression once for each occurrence of the media being presented, delivered, or otherwise provided to the client device 106. For streaming media (e.g., video, audio, etc.), the example AME 108 measures demographics for media occurring over a period of time. For example, the AME 108 (e.g., via the app publisher 110 and/or the media publisher 120) provides beacon instructions to a client application or client software (e.g., the OS 114, the web browser 117, the app 116, etc.) executing on the client device 106 when media is loaded at the client application/software 114-117. In some examples, the beacon instructions are embedded in the streaming media and delivered to the client device 106 via the streaming media. In some examples, the beacon instructions cause the client application/software 114-117 to transmit a request (e.g., a pingback message) to an impression monitoring server 132 at regular and/or irregular intervals (e.g., every minute, every 30 seconds, every 2 minutes, etc.). The example impression monitoring server 132 identifies the requests from the web browser 117 and, in combination with one or more database proprietors, associates the impression information for the media with demographics of the user of the web browser 117.

In some examples, a user loads (e.g., via the browser 117) a web page from a web site publisher, in which the web page corresponds to a particular 60-minute video. As a part of or in addition to the example web page, the web site publisher causes the data collector 112 to send a pingback message (e.g., a beacon request) to a beacon server 142 by, for example, providing the browser 117 with beacon instructions. For example, when the beacon instructions are executed by the example browser 117, the beacon instructions cause the data collector 112 to send pingback messages (e.g., beacon requests, HTTP requests, pings) to the impression monitoring server 132 at designated intervals (e.g., once every minute or any other suitable interval). The example beacon instructions (or a redirect message from, for example, the impression monitoring server 132 or a database proprietor 104 a-b) further cause the data collector 112 to send pingback messages or beacon requests to one or more database proprietors 104 a-b that collect and/or maintain demographic information about users. The database proprietor 104 a-b transmits demographic information about the user associated with the data collector 112 for combining or associating with the impression determined by the impression monitoring server 132. If the user closes the web page containing the video before the end of the video, the beacon instructions are stopped, and the data collector 112 stops sending the pingback messages to the impression monitoring server 132. In some examples, the pingback messages include timestamps and/or other information indicative of the locations in the video to which the numerous pingback messages correspond. By determining a number and/or content of the pingback messages received at the impression monitoring server 132 from the client device 106, the example impression monitoring server 132 can determine that the user watched a particular length of the video (e.g., a portion of the video for which pingback messages were received at the impression monitoring server 132).
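One way the watched length might be derived from received pingback messages is sketched below. The crediting rule (each pingback credits one full interval, and repeated pingbacks for the same interval are counted once) and the name `estimate_seconds_watched` are assumptions for illustration, not the method required by this disclosure.

```python
def estimate_seconds_watched(pingback_offsets, interval_seconds=60):
    """Estimate how much of a video was watched from pingback positions.

    pingback_offsets: positions in the video, in seconds, carried by the
        received pingback messages (hypothetical payload field).
    interval_seconds: the designated pingback interval.
    """
    # Map each pingback to the interval it falls in; a set removes duplicates
    # such as a replayed segment reporting the same position twice.
    intervals_seen = {int(offset // interval_seconds) for offset in pingback_offsets}
    return len(intervals_seen) * interval_seconds


# Pingbacks at 0 s, 60 s, 120 s (twice), and 180 s of a 60-minute video.
watched = estimate_seconds_watched([0, 60, 120, 120, 180])
```

With a one-minute interval, the five pingbacks above cover four distinct intervals, so this sketch credits four minutes of viewing.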

The client device 106 of the illustrated example executes a client application/software 114-117 that is directed to a host website (e.g., www.acme.com) from which the media 118 (e.g., audio, video, interactive media, streaming media, etc.) is obtained for presenting via the client device 106. In the illustrated example, the media 118 (e.g., advertisements and/or content) is tagged with identifier information (e.g., a media ID 122, a creative type ID, a placement ID, a publisher source URL, etc.) and a beacon instruction. The example beacon instruction causes the client application/software 114-117 to request further beacon instructions from a beacon server 142 that will instruct the client application/software 114-117 on how and where to send beacon requests to report impressions of the media 118. For example, the example client application/software 114-117 transmits a request including an identification of the media 118 (e.g., the media identifier 122) to the beacon server 142. The beacon server 142 then generates and returns beacon instructions 144 to the example client device 106. Although the beacon server 142 and the impression monitoring server 132 are shown separately, in some examples the beacon server 142 and the impression monitoring server 132 are combined. In the illustrated example, beacon instructions 144 include URLs of one or more database proprietors (e.g., one or more of the partner database proprietors 104 a-b) or any other server to which the client device 106 should send beacon requests (e.g., impression requests). In some examples, a pingback message or beacon request may be implemented as an HTTP request. However, whereas a transmitted HTTP request identifies a webpage or other resource to be downloaded, the pingback message or beacon request includes the audience measurement information (e.g., ad campaign identification, content identifier, and/or device/user identification information) as its payload. The server to which the pingback message or beacon request is directed is programmed to log the audience measurement data of the pingback message or beacon request as an impression (e.g., an ad and/or content impression depending on the nature of the media tagged with the beaconing instructions). In some examples, the beacon instructions received with the tagged media 118 include the beacon instructions 144. In such examples, the client application/software 114-117 does not need to request beacon instructions 144 from a beacon server 142 because the beacon instructions 144 are already provided in the tagged media 118.
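As a rough illustration of a beacon request that carries audience measurement information as its payload, the following sketch encodes the payload in the query string of an HTTP request URL and shows a server-side routine that logs it as an impression. The parameter names, the example URL, and both function names are hypothetical; no network traffic is generated.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_beacon_url(server_url, media_id, device_user_id, impression_id):
    """Encode the audience measurement payload into a beacon request URL."""
    payload = {
        "media_id": media_id,
        "device_user_id": device_user_id,
        "impression_id": impression_id,
    }
    return f"{server_url}?{urlencode(payload)}"


def log_impression(beacon_url, impression_log):
    """Server side: extract the payload of a beacon request and log it."""
    query = parse_qs(urlsplit(beacon_url).query)
    # parse_qs returns a list per key; each key appears once in this sketch.
    record = {key: values[0] for key, values in query.items()}
    impression_log.append(record)
    return record


# Hypothetical server URL and identifier values.
beacon_url = build_beacon_url(
    "https://impressions.example/beacon", "media-122", "device-124", "imp-140"
)
impression_log = []
logged = log_impression(beacon_url, impression_log)
```

Carrying the payload in the query string is only one option; a request body or request headers could serve the same purpose.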

When the beacon instructions 144 are executed by the client device 106, the beacon instructions 144 cause the client device 106 to send beacon requests (e.g., repeatedly at designated intervals) to a remote server (e.g., the impression monitoring server 132, the media publisher 120, the database proprietors 104 a-b, or another server) specified in the beacon instructions 144. In the illustrated example, the specified server is a server of the audience measurement entity 108, namely, the impression monitoring server 132. The beacon instructions 144 may be implemented using JavaScript or any other types of instructions or script executable via a client application (e.g., a web browser) including, for example, Java, HTML, etc.

FIG. 2 illustrates an example system 200 to verify and/or correct impression data 130 corresponding to media accessed on mobile devices 106 of panelist households. In the illustrated example, the system 200 generates corrected demographic impressions used to generate and/or calibrate correction factors to correct misattribution errors in aggregate census data provided by the partner database proprietors (e.g., the partner database proprietors 104 a, 104 b of FIG. 1). In the illustrated example, the system 200 collects impression data 130 from one or more mobile devices, one of which is shown as the mobile device 106. The system 200 of the illustrated example uses the impression data 130 to create demographic impression data corresponding to media accessed via the mobile device 106.

In the illustrated example, the mobile device 106 is used by multiple users (e.g., a primary user 202, one or more secondary users 204, etc.). In the illustrated example, the primary user 202 is a member of an electronic mobile measurement (EMM) panel formed and maintained by the AME 108 and the secondary users 204 are members of the same household as the primary user 202. In the illustrated example, when the AME 108 enrolls the primary user 202 as an EMM panelist, the AME 108 collects detailed demographic information (e.g., age, gender, occupation, salary, race and/or ethnicity, marital status, highest completed education, current employment status, etc.) about the primary user 202. In the illustrated example, when the primary user 202 is enrolled as an EMM panelist or at any later date, the AME 108 also collects supplemental information (sometimes referred to herein as a “supplemental survey”) from the primary user 202. The supplemental information may include detailed demographic information of the secondary users 204, information regarding mobile device(s) 106 in the household (e.g., types of device(s), device identifier(s), etc.), information regarding usage habits of the mobile device(s) 106 (e.g., which member of the household uses which mobile device, whether mobile device(s) 106 is/are shared, etc.), information regarding usage of database proprietors (e.g., which members of the household use which services, etc.), etc. In some examples, the AME 108 assigns an EMM panelist identifier 206 to the primary user 202.

The system 200 uses device/user identifier(s) 124 and/or the EMM panelist identifier 206 included with the impression data 130 to identify impression data 130 for media accessed on a mobile device 106 known to belong to the primary user 202. In some examples, the AME 108 may set a cookie value on the mobile device 106 when the primary user 202 logs into a service of the AME 108 using credentials (e.g., username and password) corresponding to the primary user 202. In some examples, the device/user identifier(s) 124 corresponding to the mobile device 106 and/or the primary user 202 may be supplied to the AME 108 when the primary user enrolls as an EMM panelist. In some examples, the AME 108 may supply a meter or data collector (e.g., the data collector 112 of FIG. 1) integrated into one or more media viewing apps on the mobile device which provide impression data to the AME 108 with the panelist identifier 206 assigned to the primary user 202. In the illustrated example, the AME 108 uses the device/user identifier(s) 124 and/or the EMM panelist identifier 206 to pair the impression data 130 with the demographic data of the primary user 202.

In some instances, the mobile device 106 is shared with one or both of the secondary users 204. In some examples, when a secondary user 204 uses the mobile device 106, the impression data 130 reported by the device 106 includes identifier(s) (e.g., the cookie value, the device/user identifier(s) 124, the EMM panelist identifier 206, etc.) corresponding to the primary user 202. In such examples, the AME 108 incorrectly attributes an impression based on the received impression data 130 to the primary user 202 based on the included identifier(s). This creates a misattribution error because the logged impression corresponds to the secondary user 204 but is logged in association with the demographic information of the primary user 202. The example system 200 of FIG. 2 is configured to correct logged impressions for such misattribution errors.

In the illustrated example of FIG. 2, the impression server 132 logs impressions based on the impression data 130 in connection with demographic information supplied by the primary user 202 (e.g., the EMM panelist) to generate electronic mobile measurement census (EMMC) data.

The AME 108 of the illustrated example of FIG. 2 includes an impression corrector 210 to verify and/or correct demographic information paired with impression data 130 originating from mobile devices 106 of EMM panelists 202 to produce corrected EMM census data (e.g., AAM-adjusted EMM census data). The example AME 108 maintains a panelist database 212 to store information (e.g., demographic information of the EMM panelist 202, supplemental survey data, EMM panelist ID(s), device IDs, etc.) related to the EMM panelist 202 and the mobile device 106. The example impression corrector 210 uses the information stored in the panelist database 212 to determine whether the impression data 130 was misattributed to the EMM panelist 202 and, if so, which secondary user's 204 demographic information should be associated with the impression data 130 instead. After the impression data 130 is verified and/or corrected by the impression corrector 210, the AAM-adjusted EMM census data is stored in the EMM census database 214. In some examples, the AAM-adjusted EMM census data in the EMMC database 214 is used to generate an EMM census report 208 and/or is used to calibrate misattribution correction factors that can be used to correct aggregate impression data provided by database proprietors such as the user information 102 a-b provided by the database proprietors 104 a-b of FIG. 1.

The example AME 108 includes an assignment modeler 216 to generate an activity assignment model for use by the impression corrector 210 to calculate probability scores used to verify and/or correct demographic information paired with the impression data 130 of EMM panelists 202 in the impressions store 134. A probability score represents the probability that a person (e.g., the EMM panelist 202, the household members 204, etc.) with certain attributes (e.g., demographic information) accessed a television program with certain attributes. The example assignment modeler 216 retrieves historic exposure data stored in an exposure database 218 to generate the activity assignment model. In some examples, the historic exposure data stored in the exposure database 218 includes historic television exposure census data and/or historic EMM census data. In some examples, from time to time (e.g., aperiodically, every six months, every year, etc.), the assignment modeler 216 regenerates the activity assignment model with more current historic data. As a result, the exposure database 218 may only retain exposure data for a limited number (e.g., two, four, etc.) of television seasons (e.g., the fall season and the spring season, etc.).

In some examples, to generate the activity assignment model, the assignment modeler 216 retrieves the census data stored in the exposure database 218 and selects a majority portion (e.g., 65%, 80%, etc.) of the census data to be one or more training datasets. In such examples, the remaining portion (e.g., 35%, 20%, etc.) is designated as a validation set. In some examples, the census data stored in the exposure database 218 may be split into a number of subsets (e.g., five subsets, each with 20% of the census data, etc.). In some such examples, one subset may be selected as the validation set and the remaining subsets form the training set. In some such examples, the activity assignment model may be trained and validated multiple times (e.g., cross-validated with different subsets being the validation subset).
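The partitioning described above can be sketched as follows. This is a minimal illustration only: the function names, the fixed random seed, and the use of integers as stand-ins for census records are assumptions made for the example and are not part of the disclosed apparatus.

```python
import random


def split_train_validation(records, train_fraction=0.8, seed=0):
    """Shuffle records and split them into a training set and a validation set."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


def k_fold_splits(records, k=5, seed=0):
    """Yield (training, validation) pairs; each of the k subsets is the validation set once."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        training = [r for j, fold in enumerate(folds) if j != i for r in fold]
        yield training, validation


# Hypothetical data: 100 integers standing in for census records.
records = list(range(100))
train, validation = split_train_validation(records, train_fraction=0.8)
folds = list(k_fold_splits(records, k=5))
```

With five subsets, each record appears in exactly one validation subset, which is what allows the model to be validated multiple times on disjoint data.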

In the illustrated example of FIG. 2, the example assignment modeler 216 selects attributes associated with the census data (e.g., demographic information, day of the week, program genre, program locality (e.g., national, local, etc.), etc.) and generates the activity assignment model using modeling techniques, such as a gradient boost regression modeling technique, a k-nearest neighbor modeling technique, etc. In some examples, different activity assignment models may be generated for different localities (e.g., local broadcast, national, etc.). For example, the assignment modeler 216 may generate a local activity assignment model for local television programming and a national activity assignment model for national television programming. In some examples, the assignment modeler 216 determines an accuracy of the activity assignment model by running the validation set through the activity assignment model and then comparing the probability scores of members of the households in the validation set as calculated by the activity assignment model with the actual members of the households contained in the validation set. For example, for a particular probability score calculation performed using the activity assignment model, the activity assignment model is considered correct or acceptable if the member of a household with the highest probability score calculated using the activity assignment model matches the actual member of the household identified by the corresponding record in the validation set. In some examples, the activity assignment model is accepted if the accuracy of the activity assignment model satisfies (e.g., is greater than or equal to) a threshold (e.g., 50%, 65%, etc.). In such examples, if the accuracy of the activity assignment model does not satisfy the threshold, the activity assignment model is regenerated by the assignment modeler 216 using a different combination of attributes.

The example impression corrector 210 uses the activity assignment model to calculate probability scores. The calculated probability scores are used to determine whether the demographic information of the EMM panelist 202 is correctly associated with the impression data 130 in the impression store 134, or whether demographic information of one of the other members of the household 204 should be assigned to the impression data 130 instead. In the illustrated example, a probability score (P_(S)) is calculated in accordance with Equation 1 below.

$P_{Sh} = \mathrm{AAM}\left(A_{1h}, A_{2h}, A_{3h}, \ldots, A_{nh}\right), \qquad \text{(Equation 1)}$

In Equation 1 above, P_(Sh) is the probability score that household member h (e.g., one of the household members 202, 204) is the person corresponding to a particular impression, AAM is the activity assignment model, and A_(1h) through A_(nh) are the attributes of the household member h and attributes of the program associated with the impression data 130 that correspond to the attributes used to generate the activity assignment model (AAM).

FIG. 3 illustrates an example table 300 depicting attributes 302 used to generate the activity assignment model. The example table 300 also includes example influence values 304 (e.g., influence weights) for corresponding attributes 302 that indicate how much influence each attribute 302 contributes in the activity assignment model. The example influence 304 is calculated after the activity assignment model is generated by the assignment modeler 216 (FIG. 2). In the illustrated example of FIG. 3, the influence 304 is a relative value (e.g., the sum of all of the influences 304 is 100%) that is used to remove attributes from the activity assignment model. A higher influence value 304 indicates that a corresponding attribute 302 has a higher influence on the probability score (but does not show how a particular attribute contributes to an individual probability score). For example, if an attribute 302 has a corresponding influence value 304 of 52%, then the value of the attribute 302 has a determinative effect in 52% of the probability scores. In some examples, the attributes 302 with the highest influence values 304 are retained in the activity assignment model. In some examples, after selecting the attributes 302 with the highest influence values 304, the assignment modeler 216 regenerates the activity assignment model using the selected attributes 302. In the illustrated example of FIG. 3, the six attributes 302 with the highest influence values 304 (e.g., age group, household size, daypart, household race/ethnicity, gender, day of the week) are selected for the activity assignment model. The attributes 302 may be selected by, for example, selecting a number of attributes 302 with the highest influence 304, or by selecting a number of attributes 302 that add up to a threshold percentage of influence 304, or by using any other suitable selection technique by which selections of attributes 302 contribute to generating an activity assignment model of a desired performance.
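The two attribute selection approaches mentioned above (a fixed number of attributes, or attributes that add up to a threshold percentage of influence) can be sketched as follows. The influence values below are hypothetical and are not the values of FIG. 3; the function names are likewise assumptions made for illustration.

```python
def select_top_n(influences, n):
    """Return the n attribute names with the highest influence values."""
    ranked = sorted(influences.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, _ in ranked[:n]]


def select_by_cumulative_influence(influences, threshold):
    """Add attributes in descending influence order until their sum reaches the threshold."""
    ranked = sorted(influences.items(), key=lambda kv: kv[1], reverse=True)
    selected, total = [], 0
    for name, value in ranked:
        if total >= threshold:
            break
        selected.append(name)
        total += value
    return selected


# Hypothetical influence values (percent); they sum to 100.
influences = {
    "age group": 30,
    "household size": 20,
    "daypart": 15,
    "household race/ethnicity": 12,
    "gender": 10,
    "day of the week": 8,
    "program genre": 3,
    "device type": 2,
}

top_six = select_top_n(influences, 6)
top_75_percent = select_by_cumulative_influence(influences, 75)
```

After either selection, the model would be regenerated using only the retained attributes, as described above.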

Returning to the illustrated example of FIG. 2, after generating the activity assignment model, the example assignment modeler 216 evaluates the activity assignment model with the validation set. If accuracy of the activity assignment model satisfies a threshold, the assignment modeler 216 provides the activity assignment model to the impression corrector 210. If accuracy of the activity assignment model does not satisfy the threshold, the assignment modeler 216 selects a different combination of attributes (e.g., the attributes 302 of FIG. 3), and regenerates and revalidates the activity assignment model. In some examples, the threshold is based on an error tolerance of customers of the AME 108 and/or a known amount of error in the training data set.

FIG. 4 illustrates an example implementation of the example impression corrector 210 of FIG. 2 to verify and/or correct demographic information associated with the impression data 130 (FIG. 1) generated by a mobile device 106 (FIG. 1) belonging to the EMM panelist 202 (FIG. 2). The example impression corrector 210 includes an example panelist identifier 400, an example probability calculator 402, an example calibration calculator 404, and an example impression designator 406. The example panelist identifier 400 retrieves impression data 130 from the impression store 134. Using a user/device identifier 126 (FIG. 1) and/or an EMM panelist ID 206 (FIG. 2) included in the retrieved impression data 130, the example panelist identifier 400 retrieves demographic data for the EMM panelist 202 and the other household member(s) 204. The example panelist identifier 400 also uses the user/device identifier 126 to retrieve device information (e.g., device type, etc.) of the client device 106 from the panelist database 212.

In the illustrated example of FIG. 4, the calibration calculator 404 calculates a calibration factor (λ) for the EMM panelist 202. The calibration factor (λ) is used to determine the likelihood that the EMM panelist 202 was exposed to the media associated with the retrieved impression data 130. Calibration factors (λ) greater than one (λ>1) signify that the EMM panelist 202 is more likely to be the person associated with the impression data 130 compared to the other household members 204. Calibration factors (λ) less than one (λ<1) mean that the EMM panelist 202 is less likely to be the person associated with the impression data 130 compared to the other household members 204. Calibration factors (λ) equal to one (λ=1) mean that the EMM panelist 202 is as likely to be the person associated with the impression data 130 as the other household members 204.

The calibration factor (λ) is based on the demographic group of the EMM panelist 202 and the type of portable device 106 on which the media was accessed. The demographic groups are defined by demographic information, genre of the media presentation and/or type of mobile device. For example, one demographic group may be “Hispanic, female, age 30-34, iPad® tablet device” while another demographic group may be “Hispanic, female, age 30-34, Android™ smartphone device.” In the illustrated example, a calibration factor (λ) is calculated using Equation 2 below.

$\lambda = \frac{T_{AVG}\left(\text{EMM panelist demo group}\right)}{T_{AVG}\left(\text{all demo groups}\right)}, \qquad \text{(Equation 2)}$

In Equation 2 above, T_(AVG)(EMM panelist demo group) is the average time the EMM panelist's demographic (demo) group is exposed to the media presentation or media genre associated with the impression data 130, and T_(AVG)(all demo groups) is the average time across all demographic (demo) groups associated with exposure to the media presentation or media genre associated with the impression data 130. For example, if the T_(AVG)(Hispanic, female, age 30-34, iPad tablet device) is 0.5 hours and the T_(AVG)(all demo groups) is 0.34 hours, the calibration factor would be 1.47 (0.5/0.34=1.47).
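The arithmetic of Equation 2 can be checked with a short sketch. The function name and the guard against a non-positive denominator are assumptions added for the example; the input values are those of the example above.

```python
def calibration_factor(avg_time_group, avg_time_all):
    """Equation 2: ratio of the panelist's demo-group average time to the all-groups average."""
    if avg_time_all <= 0:
        raise ValueError("average time across all demo groups must be positive")
    return avg_time_group / avg_time_all


# Values from the example above: 0.5 hours for the demo group, 0.34 hours overall.
lam = calibration_factor(0.5, 0.34)
lam_rounded = round(lam, 2)  # 1.47
```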

The example probability calculator 402 receives or retrieves the household information and the device information from the panelist identifier 400. In some examples, the probability calculator 402 first determines whether the EMM panelist 202 is to be attributed to the exposure of the media associated with the impression data 130 before calculating a probability score for any other member of the household 204. In some such examples, the probability calculator 402 determines that the EMM panelist 202 is to be attributed to the exposure if (i) the EMM panelist 202 is the only member of the household, (ii) the EMM panelist 202 has indicated (e.g., when recruited as a panelist, on a supplemental survey, etc.) that the particular portable device 106 is not shared, or (iii) the probability score of the EMM panelist 202 (P_(Sp)) satisfies the criterion indicated in Equation 3 below.

$P_{Sp} \geq \frac{\lambda}{HH} \qquad \text{(Equation 3)}$

In Equation 3 above, HH is the size of the EMM panelist's 202 household. For example, if the probability score (P_(Sp)) of the EMM panelist 202, as calculated by the activity assignment model, is 0.42, the calibration factor (λ) is 1.47, and the size of the household (HH) is 4, the probability calculator 402 confirms the EMM panelist 202 is to be attributed to the exposure (0.42 ≥ 1.47/4 ≈ 0.37).
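The three conditions under which the EMM panelist is attributed to the exposure, including the Equation 3 criterion, can be sketched as follows. The function and parameter names are hypothetical; the probability score is assumed to have been produced elsewhere by the activity assignment model.

```python
def panelist_is_attributed(p_panelist, calibration, household_size, shares_device=True):
    """Return True if the EMM panelist is to be attributed to the exposure."""
    if household_size == 1:
        return True  # (i) the panelist is the only member of the household
    if not shares_device:
        return True  # (ii) the panelist indicated the device is not shared
    # (iii) Equation 3: P_Sp >= lambda / HH
    return p_panelist >= calibration / household_size


# Example above: 0.42 >= 1.47 / 4 (about 0.37), so the panelist is attributed.
confirmed = panelist_is_attributed(0.42, 1.47, 4)
# A lower, hypothetical score of 0.30 would not satisfy the criterion.
rejected = panelist_is_attributed(0.30, 1.47, 4)
```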

In the illustrated example of FIG. 4, if the probability calculator 402 determines that the EMM panelist 202 is not to be attributed to the exposure, the probability calculator 402 calculates a probability score, using the activity assignment model, for every other member 204 of the EMM panelist's 202 household. In some examples, the probability calculator 402 selects the member 204 of the household with the highest probability score.

In the illustrated example of FIG. 4, the impression designator 406 replaces the demographic information associated with the impression data with the demographic information of the person (e.g., the EMM panelist 202, the member of the household 204) selected by the probability calculator 402 to form the AAM-adjusted EMM census data. The example impression designator 406 stores the AAM-adjusted EMM census data in the EMM census database 214 and/or includes the EMM census data on an EMM census report 208.

While an example manner of implementing the example impression corrector 210 of FIG. 2 is illustrated in FIG. 4, one or more of the elements, processes and/or devices illustrated in FIG. 4 may be combined, divided, re-arranged, omitted, eliminated and/or implemented in any other way. Further, the example panelist identifier 400, the example probability calculator 402, the example calibration calculator 404, the example impression designator 406 and/or, more generally, the example impression corrector 210 of FIG. 2 may be implemented by hardware, software, firmware and/or any combination of hardware, software and/or firmware. Thus, for example, any of the example panelist identifier 400, the example probability calculator 402, the example calibration calculator 404, the example impression designator 406 and/or, more generally, the example impression corrector 210 could be implemented by one or more analog or digital circuit(s), logic circuits, programmable processor(s), application specific integrated circuit(s) (ASIC(s)), programmable logic device(s) (PLD(s)) and/or field programmable logic device(s) (FPLD(s)). When reading any of the apparatus or system claims of this patent to cover a purely software and/or firmware implementation, at least one of the example panelist identifier 400, the example probability calculator 402, the example calibration calculator 404, and/or the example impression designator 406 is/are hereby expressly defined to include a tangible computer readable storage device or storage disk such as a memory, a digital versatile disk (DVD), a compact disk (CD), a Blu-ray disk, etc. storing the software and/or firmware. Further still, the example impression corrector 210 of FIG. 2 may include one or more elements, processes and/or devices in addition to, or instead of, those illustrated in FIG. 4, and/or may include more than one of any or all of the illustrated elements, processes and devices.

FIG. 5 depicts an example system 500 to use EMM census data 502 to calibrate misattribution correction factors used to correct the misattributions associated with census data 504. In the illustrated example, the census data 504 includes census data from various sources (e.g., aggregate census data provided by database proprietors, census data from monitoring EMM panelists, etc.). The example EMM census data 502 is a subset of the census data 504 that contains census data from monitoring EMM panelists. In some examples, the EMM census data 502 is household-level impression data (e.g., impressions that are all associated with the EMM panelist of a household) and not user-level impression data (e.g., impressions that are associated with individual household members).

In the illustrated example, the impression corrector 210 uses an activity assignment model 506 (e.g., the activity assignment model generated by the assignment modeler 216 of FIG. 2) to verify and/or correct demographic data associated with EMM census data 502 to produce AAM-adjusted EMM census data 503. For example, some EMM census data 502 may be collected before the EMM panelist 202 (FIG. 2) returns a supplemental survey response to the AME 108 (FIG. 1) providing demographic information about members of the EMM panelist's household. In that example, before receiving a supplemental survey response, the impressions 130 (FIG. 1) would be associated with the EMM panelist 202, even though another household member 204 actually accessed the media. As another example, an updated supplemental survey response may be submitted by the EMM panelist 202 if usage habits of the portable device 106 (FIG. 1) change and/or if the composition of the household changes. In some examples, the EMM census data 502 may be reprocessed upon receiving a new and/or updated household demographic survey. As another example, impression data 130 may be assigned to the EMM panelist 202 that owns the mobile device 106 in an initial processing phase (e.g., when the impression request is received by the AME 108, etc.) and then may be verified and/or corrected in a post-processing phase. In the illustrated example, the AAM-adjusted EMM census data 503 is stored in the EMM census database 214.

Initially, in some examples, the example sharing matrix generator 508 calculates device sharing matrices based on probability survey data from a probability survey database 510. The probability survey is a survey conducted on randomly selected people and/or households. In some examples, the selected people and/or households are selected from panelists enrolled in one or more panels with the AME 108. Alternatively or additionally, the selected people and/or households that are not enrolled in an AME panel are randomly selected (e.g., via phone solicitations, via Internet advertisements, etc.). In some instances, the probability survey is a survey conducted on non-panelist households because panelist households are a relatively small portion of the population and the census data 504 includes demographic impressions of non-panelist households. In the illustrated example, the probability survey includes information about demographics of each member of the household, type of devices in the household, which members of the household use which devices, media viewing preferences, which members of the household are registered with which database proprietors, etc. However, the probability survey data 510 does not include detailed viewing habits of the surveyed household (e.g., which member of the household is responsible for which percentage of genre-specific media access, etc.).

To illustrate, consider the following example. An example non-panelist household from which a probability survey is conducted includes four members: 1) a 35-39 year old male, 2) a 40-44 year old female, 3) a 12-14 year old male, and 4) a 9-11 year old male. On the probability survey, the 35-39 year old male and the 12-14 year old male indicate that they have registered with an example database proprietor (e.g., Facebook, Google, Yahoo!, etc.) and access, from time to time, the database proprietor via a tablet computer (e.g., the mobile device 106 of FIG. 2). The probability survey indicates which genre of media each of the members of the household access on the tablet. Table 1 below illustrates an example exposure pattern for the tablet (e.g., an “X” indicates that the member of the family is exposed to media of that genre on the tablet).

TABLE 1
EXAMPLE EXPOSURE PATTERN FOR A TABLET BY MEDIA GENRE IN AN EXAMPLE HOUSEHOLD BASED ON PROBABILITY SURVEY DATA

| Content Type / Demographic Group | M35-39 | F40-44 | M12-14 | M9-11 |
|---|---|---|---|---|
| All | X | X | X | X |
| Political | X | X | | |
| Drama | | X | | |
| Kids | | | | X |
| Comedy | X | | X | |

Using the probability survey data, the example sharing matrix generator 508 generates device sharing probabilities (sometimes referred to as probability density functions (PDFs)) that the person identified in the demographic group in the household is exposed to the type of content (e.g., media genre) on the device. Table 2 below illustrates device sharing probabilities based on the example household. Device sharing probabilities are the probability that a member of an age-based demographic group accessed a specific genre of media on the mobile device. Because the probability survey data 510 does not include detailed exposure information, in the illustrated example, the sharing matrix generator 508 assumes that the members of the household that are exposed to the indicated media genre are exposed to it equally.

TABLE 2
EXAMPLE DEVICE SHARING PROBABILITIES BY MEDIA GENRE IN AN EXAMPLE HOUSEHOLD BASED ON PROBABILITY SURVEY DATA

| Content Type / Demographic Group | M35-39 | F40-44 | M12-14 | M9-11 |
|---|---|---|---|---|
| All | 0.25 | 0.25 | 0.25 | 0.25 |
| Political | 0.50 | 0.50 | — | — |
| Drama | — | 1.00 | — | — |
| Kids | — | — | — | 1.00 |
| Comedy | 0.50 | — | 0.50 | — |

For example, according to Table 2 above, if impression data for political media is received from the above-described household, the probability that the male age 35-39 accessed the media is 50%. However, actual sharing probabilities within a household may be different. For example, the 35-39 year old male may be exposed to 75% of the political media on the tablet, while the 40-44 year old female may only be exposed to 25% of the political media on the tablet.
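The equal-exposure assumption that turns the exposure pattern of Table 1 into the probabilities of Table 2 can be sketched as follows. The dictionary layout and the function name are assumptions made for illustration; the exposure pattern itself is taken from Table 1.

```python
def equal_share_probabilities(exposure_pattern):
    """Give every exposed household member an equal share of each genre."""
    matrix = {}
    for genre, members in exposure_pattern.items():
        if not members:
            continue  # no member reported exposure to this genre
        share = 1.0 / len(members)
        matrix[genre] = {member: share for member in members}
    return matrix


# Exposure pattern from Table 1: genre -> members marked with an "X".
table_1 = {
    "All": ["M35-39", "F40-44", "M12-14", "M9-11"],
    "Political": ["M35-39", "F40-44"],
    "Drama": ["F40-44"],
    "Kids": ["M9-11"],
    "Comedy": ["M35-39", "M12-14"],
}

table_2 = equal_share_probabilities(table_1)
```

Members that are not exposed to a genre are simply absent from that genre's row, which corresponds to the dash entries of Table 2.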

To provide more detailed device sharing probabilities, the AAM-adjusted EMM census data 503 in the example EMM census database 214 contains detailed exposure information (e.g., the impressions 130 of FIG. 1) paired with detailed demographic information. To illustrate, consider the following example. An example panelist household from which AAM-adjusted EMM census data 503 is collected includes four members: 1) a 30-34 year old male, 2) a 30-34 year old female (who is the EMM panelist), 3) a 12-14 year old male, and 4) a 12-14 year old female. During the EMM panel enrollment process (e.g., via a supplemental survey), the 30-34 year old female EMM panelist indicates that a tablet computer (e.g., the device 106 of FIG. 2) is shared with the entire household.

The demographic impressions generated by the tablet are processed by the impression corrector 210 to verify and/or correct demographic information associated with the demographic impressions to produce AAM-adjusted EMM census data 503. In the illustrated example, the sharing matrix generator 508 analyzes the AAM-adjusted EMM census data 503 for the household. For example, the sharing matrix generator 508 may, for a household, retrieve the impression data in the AAM-adjusted EMM census data 503 for a specific media genre (e.g., political, drama, kids, comedy, etc.). The sharing matrix generator 508 may then calculate what percentage of the accesses to the specific media genre is attributable to each member of the household. In such an example, the sharing matrix generator 508 may repeat this process until the percentages are calculated for each genre of interest to generate device sharing probabilities, as shown in Table 3 below.

TABLE 3
EXAMPLE DEVICE SHARING PROBABILITIES BY MEDIA GENRE IN AN EXAMPLE HOUSEHOLD BASED ON AAM-ADJUSTED EMM CENSUS DATA

| Content Type / Demographic Group | M30-34 | F30-34 | M12-14 | F12-14 |
|---|---|---|---|---|
| All | 0.34 | 0.66 | — | — |
| Political | 0.75 | 0.25 | — | — |
| Drama | — | 0.91 | — | 0.09 |
| Kids | — | — | 0.56 | 0.44 |
| Comedy | 0.23 | — | 0.45 | 0.32 |

For example, according to Table 3 above, if impression data for political media is received from the above-described household, the probability that the male, age 30-34, accessed the media is 75%. As another example, if impression data for political media is received from the above-described household that is associated with the female, age 12-14, 75% of such impression data should be associated with the male, age 30-34, and 25% of such impression data should be associated with the female, age 30-34, instead.
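The per-genre percentage calculation that yields a matrix such as Table 3 can be sketched as follows. The impression records below are hypothetical (four political impressions chosen so that the result matches the Political row of Table 3), and the function name and record layout are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def sharing_probabilities(impressions):
    """Compute, per genre, the fraction of impressions attributed to each household member.

    impressions: iterable of (genre, member) pairs taken from AAM-adjusted data.
    """
    counts = defaultdict(Counter)
    for genre, member in impressions:
        counts[genre][member] += 1
    return {
        genre: {member: n / sum(per_member.values()) for member, n in per_member.items()}
        for genre, per_member in counts.items()
    }


# Hypothetical AAM-adjusted impressions for one household.
impressions = [
    ("Political", "M30-34"),
    ("Political", "M30-34"),
    ("Political", "M30-34"),
    ("Political", "F30-34"),
    ("Drama", "F30-34"),
]

matrix = sharing_probabilities(impressions)
```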

In some examples, the sharing matrix generator 508 calculates device sharing probabilities using the AAM-adjusted EMM census data 503 to calibrate the device sharing probabilities calculated for the probability surveys. For example, instead of assigning an equal probability to household members that are exposed to a particular media genre on a mobile device, the AME 108 may assign weighted probabilities of being exposed to the genre on the device based on the device sharing matrices calculated by the sharing matrix generator 508 using the AAM-adjusted EMM census data 503. In some examples, the sharing matrix generator 508 may use both the device sharing probabilities generated based on the AAM-adjusted EMM census data 503 and those generated based on the probability survey data 510 to generate misattribution correction factors. In some such examples, the device sharing probabilities may be weighted according to contribution to the misattribution correction factors. For example, if six thousand matrices of device sharing probabilities based on probability survey data 510 and four thousand matrices of device sharing probabilities based on the AAM-adjusted EMM census data 503 are used, the device sharing matrices based on probability survey data 510 would be weighted by 0.6 (6,000/(6,000+4,000)), while device sharing matrices based on the AAM-adjusted EMM census data 503 would be weighted by 0.4 (4,000/(6,000+4,000)).
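The weighting in the example above can be sketched as follows. The weights follow directly from the matrix counts given in the text. The `blend` function is an assumption: it shows one simple way (a weighted average) that two probabilities for the same cell might be combined, and the two input probabilities are hypothetical. The actual use of the weights in generating misattribution correction factors is described in the application referenced below, not here.

```python
def source_weights(matrix_counts):
    """Weight each data source by its share of the total number of sharing matrices."""
    total = sum(matrix_counts.values())
    return {source: n / total for source, n in matrix_counts.items()}


def blend(prob_by_source, weights):
    """Hypothetical combination: weighted average of one probability from each source."""
    return sum(weights[source] * p for source, p in prob_by_source.items())


# Counts from the example above.
weights = source_weights({"probability_survey": 6000, "aam_adjusted_emm": 4000})

# Hypothetical probabilities for the same demographic group and genre.
blended = blend({"probability_survey": 0.50, "aam_adjusted_emm": 0.75}, weights)
```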

Examples for using device sharing probabilities to generate correction factors to correct misattribution errors are disclosed in U.S. patent application Ser. No. 14/560,947, filed Dec. 4, 2014, entitled “Methods and Apparatus to Compensate Impression Data for Misattribution and/or Non-Coverage by a Database Proprietor,” which is incorporated herein by reference in its entirety.

A flowchart representative of example machine readable instructions for implementing the example impression corrector 210 of FIGS. 2, 4, and/or 5 is shown in FIG. 6. A flowchart representative of example machine readable instructions for implementing the example assignment modeler 216 of FIG. 2 is shown in FIG. 7. In these examples, the machine readable instructions comprise program(s) for execution by a processor such as the processor 812 shown in the example processor platform 800 discussed below in connection with FIG. 8. The program may be embodied in software stored on a tangible computer readable storage medium such as a CD-ROM, a floppy disk, a hard drive, a digital versatile disk (DVD), a Blu-ray disk, or a memory associated with the processor 812, but the entire program and/or parts thereof could alternatively be executed by a device other than the processor 812 and/or embodied in firmware or dedicated hardware. Further, although the example programs are described with reference to the flowcharts illustrated in FIGS. 6 and 7, many other methods of implementing the example impression corrector 210 and/or the example assignment modeler 216 may alternatively be used. For example, the order of execution of the blocks may be changed, and/or some of the blocks described may be changed, eliminated, or combined.

As mentioned above, the example processes of FIGS. 6 and 7 may be implemented using coded instructions (e.g., computer and/or machine readable instructions) stored on a tangible computer readable storage medium such as a hard disk drive, a flash memory, a read-only memory (ROM), a compact disk (CD), a digital versatile disk (DVD), a cache, a random-access memory (RAM) and/or any other storage device or storage disk in which information is stored for any duration (e.g., for extended time periods, permanently, for brief instances, for temporarily buffering, and/or for caching of the information). As used herein, the term tangible computer readable storage medium is expressly defined to include any type of computer readable storage device and/or storage disk and to exclude propagating signals and to exclude transmission media. As used herein, “tangible computer readable storage medium” and “tangible machine readable storage medium” are used interchangeably. Additionally or alternatively, the example processes of FIGS. 6 and 7 may be implemented using coded instructions (e.g., computer and/or machine readable instructions) stored on a non-transitory computer and/or machine readable medium such as a hard disk drive, a flash memory, a read-only memory, a compact disk, a digital versatile disk, a cache, a random-access memory and/or any other storage device or storage disk in which information is stored for any duration (e.g., for extended time periods, permanently, for brief instances, for temporarily buffering, and/or for caching of the information). As used herein, the term non-transitory computer readable medium is expressly defined to include any type of computer readable storage device and/or storage disk and to exclude propagating signals and to exclude transmission media. As used herein, when the phrase “at least” is used as the transition term in a preamble of a claim, it is open-ended in the same manner as the term “comprising” is open ended.

FIG. 6 is a flow diagram representative of example machine readable instructions 600 that may be executed to implement the example impression corrector 210 of FIGS. 2 and 4 to verify and/or correct associations of demographic information with impression data 130 (FIG. 1) collected from a portable device 106 (FIG. 1) belonging to an EMM panelist 202 (FIG. 2). Initially, at block 602, the panelist identifier 400 (FIG. 4) retrieves impression data 130 from the impressions store 134 (FIG. 1). At block 604, based on EMM panelist ID 206 (FIG. 2) and/or user/device identifier 124 (FIG. 1) included in the impression data 130 retrieved at block 602, the panelist identifier 400 retrieves information (e.g., demographic information, unique identifier, etc.) for the primary user 202 (e.g., the EMM panelist) and the secondary user(s) 204 (e.g., the members of the EMM panelist's household), and information (e.g., device type, etc.) for the portable device 106. At block 606, the probability calculator 402 (FIG. 4) determines if the size of the primary user's 202 household is equal to one (e.g., the primary user 202 lives alone). If the size of the primary user's 202 household is one, program control advances to block 616, at which the impression designator 406 (FIG. 4) associates the demographic information of the primary user 202 with the impression data 130 to create AAM-adjusted EMM census data 503 (FIG. 5). Otherwise, if the size of the primary user's 202 household is not one, program control advances to block 608.

At block 608, the probability calculator 402 determines if the primary user 202 indicated that he/she does not share the particular portable device 106 identified at block 604. If the primary user 202 indicated that he/she does not share the particular portable device 106, program control advances to block 616, at which the impression designator 406 associates the demographic information of the primary user 202 with the impression data 130 to create AAM-adjusted EMM census data 503. Otherwise, if the primary user 202 indicated that he/she does share the particular portable device 106, program control advances to block 610. At block 610, the probability calculator 402 calculates a probability score for the primary user 202. In some examples, the probability calculator 402 calculates the probability score in accordance with Equation 1 above. At block 612, the calibration calculator 404 (FIG. 4) retrieves (e.g., from a pre-calculated table, etc.) or calculates a calibration factor (λ) based on the demographic information of the primary user 202 and the device type of the portable device 106. In some examples, the calibration factor (λ) is calculated in accordance with Equation 2 above. The example calibration calculator 404 calculates a threshold based on the calibration factor (λ). In some examples, the threshold is equal to

$\frac{\lambda}{HH},$

where HH is the size of the primary user's household.

At block 614, the probability calculator 402 determines whether the probability score of the primary user 202 satisfies the threshold. In some examples, whether the probability score of the primary user 202 satisfies the threshold is determined in accordance with Equation 3 above. If the probability score of the primary user 202 satisfies the threshold, program control advances to block 616. Otherwise, if the probability score of the primary user 202 does not satisfy the threshold, program control advances to block 618. At block 616, the impression designator 406 (FIG. 4) associates the demographic information of the primary user 202 with the impression data 130 to create AAM-adjusted EMM census data 503. In some examples, the impression designator 406 stores the AAM-adjusted EMM census data 503 in the EMM census database 214 (FIG. 2).

At block 618, the probability calculator 402 calculates a probability score for each of the secondary user(s) 204 in the primary user's 202 household. At block 620, the impression designator 406 associates the demographic information of the secondary user 204 with the highest probability score calculated at block 618 with the impression retrieved at block 602 to create the AAM-adjusted EMM census data 503. In some examples, the impression designator 406 stores the AAM-adjusted EMM census data 503 into the EMM census database 214. At block 622, the probability calculator 402 determines whether there is another impression to be analyzed for an activity assignment. If there is another impression to be analyzed for an activity assignment, program control returns to block 602. Otherwise, if there is not another impression to be analyzed for an activity assignment, the example program 600 of FIG. 6 ends.
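The decision flow of blocks 606 through 620 for a single impression can be sketched as follows. This is an illustration under stated assumptions: the `score` callable stands in for the activity assignment model of Equation 1, household members are represented by demographic-group labels, and the probability scores are hypothetical values chosen for the example.

```python
def assign_impression(primary, secondaries, shares_device, score, calibration):
    """Return the household member to associate with one impression.

    score: callable mapping a member to a probability score (stand-in for the AAM).
    calibration: calibration factor (lambda) for the primary user's demo group and device.
    """
    household_size = 1 + len(secondaries)
    if household_size == 1:      # block 606: primary user lives alone
        return primary
    if not shares_device:        # block 608: device is not shared
        return primary
    threshold = calibration / household_size  # block 612: lambda / HH
    if score(primary) >= threshold:           # blocks 610 and 614
        return primary
    # blocks 618 and 620: pick the secondary user with the highest score
    return max(secondaries, key=score)


# Hypothetical probability scores for a four-person household.
scores = {"F30-34": 0.20, "M30-34": 0.45, "M12-14": 0.25, "F12-14": 0.10}
secondaries = ["M30-34", "M12-14", "F12-14"]

chosen = assign_impression("F30-34", secondaries, True, scores.get, 1.47)
unshared = assign_impression("F30-34", secondaries, False, scores.get, 1.47)
```

In this example the threshold is 1.47/4 (about 0.37), so the primary user's hypothetical score of 0.20 does not satisfy it and the impression is reassigned to the highest-scoring secondary user.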

FIG. 7 is a flow diagram representative of example machine readable instructions 700 that may be executed to implement the example assignment modeler 216 of FIGS. 2 and 4 to construct an activity assignment model (e.g., the activity assignment model 506 of FIG. 5). Initially, at block 702, the assignment modeler 216 selects attributes to include in the activity assignment model 506. For example, the attributes may be information related to demographic information (e.g., gender, race/ethnicity, education level, age, etc.), information related to households (e.g., household size, primary/secondary household language, etc.), information related to media (e.g., genre, daypart, etc.), and/or information related to the portable device 106 (FIG. 1) (e.g., device type, operating system, etc.). At block 704, the assignment modeler 216 uses the selected attributes to construct a candidate model with a training set of known EMM census data. The example candidate model may be generated using any suitable technique such as a gradient boost regression modeling technique, a k-nearest neighbor modeling technique, etc.
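As one possible illustration of block 704, a k-nearest neighbor candidate model over categorical attributes may be sketched as follows. This is a minimal sketch and not the disclosed implementation: the names `build_knn_model` and `mismatch_count`, the record layout, the mismatch-count distance, and the default k of 3 are all assumptions made for the example.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical record layout: (attribute values, known household member).
Record = Tuple[Dict[str, str], str]


def mismatch_count(a: Dict[str, str], b: Dict[str, str],
                   attributes: Sequence[str]) -> int:
    """Count the selected attributes on which two impressions differ."""
    return sum(1 for name in attributes if a.get(name) != b.get(name))


def build_knn_model(training_set: List[Record],
                    attributes: Sequence[str],
                    k: int = 3) -> Callable[[Dict[str, str]], str]:
    """Return a predictor that votes among the k closest training records."""
    if not training_set:
        raise ValueError("training_set must not be empty")

    def predict(impression: Dict[str, str]) -> str:
        # Rank known impressions by distance on the selected attributes only.
        nearest = sorted(
            training_set,
            key=lambda rec: mismatch_count(impression, rec[0], attributes),
        )[:k]
        # Majority vote over the labels of the nearest records.
        votes = Counter(label for _, label in nearest)
        return votes.most_common(1)[0][0]

    return predict
```

Restricting the distance to the selected attributes means that a different attribute selection yields a different candidate model from the same training set.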

At block 706, the candidate model is evaluated using a validation set of known EMM census data. For example, a demographic impression from the EMM census data is input into the candidate model. In that example, the output of the candidate model (e.g., which member of the household the candidate model associated with the demographic impression) is compared to the known answer (e.g., the actual member of the household associated with the demographic impression). In some examples, a correct probability rate (CPR) is calculated by determining what percentage of the validation set the candidate model predicted correctly. For example, if the candidate model predicts sixty-five out of a hundred demographic impressions correctly, the CPR is 65%. In some examples where multiple validation sets are used, the CPR is an average value of the percentage of correct predictions. At block 708, the assignment modeler 216 determines whether the CPR satisfies (e.g., is greater than or equal to) a threshold. In some examples, the threshold is based on an error tolerance of customers of the AME 108 and/or a known amount of error in the training data set. If the CPR satisfies the threshold, program control advances to block 710. Otherwise, if the CPR does not satisfy the threshold, program control advances to block 712.
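The CPR evaluation of blocks 706 and 708 may be sketched as follows. The function names and the representation of a model as a callable are hypothetical choices made for the example; the comparison at block 708 uses greater than or equal to, per the example given above.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical types: an impression is a mapping of attribute values, and a
# model is any callable returning the predicted household member.
Impression = Dict[str, str]
Model = Callable[[Impression], str]
ValidationSet = List[Tuple[Impression, str]]


def correct_probability_rate(model: Model, validation_set: ValidationSet) -> float:
    """Percentage of the validation set that the model predicts correctly."""
    if not validation_set:
        raise ValueError("validation_set must not be empty")
    correct = sum(
        1 for impression, actual in validation_set if model(impression) == actual
    )
    return 100.0 * correct / len(validation_set)


def average_cpr(model: Model, validation_sets: List[ValidationSet]) -> float:
    """Average CPR when multiple validation sets are used."""
    if not validation_sets:
        raise ValueError("validation_sets must not be empty")
    rates = [correct_probability_rate(model, s) for s in validation_sets]
    return sum(rates) / len(rates)


def cpr_satisfies(cpr: float, threshold: float) -> bool:
    """Block 708: the CPR satisfies the threshold when it is at least as large."""
    return cpr >= threshold
```

With sixty-five correct predictions out of one hundred, `correct_probability_rate` returns 65.0, matching the example above.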

At block 710, the assignment modeler 216 sends the activity assignment model to the impression corrector 210. The example program 700 then ends. At block 712, the assignment modeler 216 adjusts and/or selects the attributes used in the candidate model. In some examples, to adjust the attributes used in the candidate model, the assignment modeler 216 selects one or more attributes that were not included in the candidate model generated at block 704. In some examples, the assignment modeler 216 selects the attributes based on a relative influence of each attribute. The relative influence indicates the predictive weight of the corresponding attribute on the probability score (but does not show how a particular attribute contributes to an individual probability score). For example, an attribute with a 53% relative influence will contribute to the probability score (e.g., the value of the attribute will affect the outcome of the activity assignment model) for 53% of the possible probability scores. In some such examples, the assignment modeler 216 calculates the relative influence of each of the attributes used in the candidate model. In some such examples, the assignment modeler 216 discards the attributes that have an influence below a threshold and/or picks the attributes with the highest influence (e.g., those that add up to a target relative influence). Control returns to block 704, at which a new activity assignment model is constructed.
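The attribute selection of block 712 may be sketched as follows, assuming the relative influence of each attribute has already been calculated and is supplied as a percentage. The function name `select_attributes`, its parameters, and the attribute names in the usage note are hypothetical; how the relative influence itself is computed is outside the scope of this sketch.

```python
from typing import Dict, List, Optional


def select_attributes(
    relative_influence: Dict[str, float],
    min_influence: float = 0.0,
    target_total: Optional[float] = None,
) -> List[str]:
    """Pick attributes for the next candidate model.

    relative_influence maps attribute name to its influence in percent.
    Attributes below min_influence are discarded. If target_total is
    given, the highest-influence attributes are taken until their
    combined influence reaches that target.
    """
    # Discard attributes whose influence falls below the threshold.
    kept = {
        name: value
        for name, value in relative_influence.items()
        if value >= min_influence
    }
    # Rank the remaining attributes from most to least influential.
    ranked = sorted(kept, key=kept.get, reverse=True)
    if target_total is None:
        return ranked

    selected: List[str] = []
    running_total = 0.0
    for name in ranked:
        if running_total >= target_total:
            break
        selected.append(name)
        running_total += kept[name]
    return selected
```

For example, given influences of 53, 25, 15, and 7 percent for daypart, device type, genre, and household size, a minimum influence of 10 keeps the first three attributes, while a target total of 75 keeps only daypart and device type.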

FIG. 8 is a block diagram of an example processor platform 800 structured to execute the instructions of FIGS. 6 and/or 7 to implement the example impression corrector 210 and/or the example assignment modeler 216 of FIGS. 2 and/or 4. The processor platform 800 can be, for example, a server, a personal computer, a workstation, or any other type of computing device. In some examples, separate processor platforms 800 may be used to implement the example impression corrector 210 and the example assignment modeler 216.

The processor platform 800 of the illustrated example includes a processor 812. The processor 812 of the illustrated example is hardware. For example, the processor 812 can be implemented by one or more integrated circuits, logic circuits, microprocessors or controllers from any desired family or manufacturer.

The processor 812 of the illustrated example includes a local memory 813 (e.g., a cache). The example processor 812 implements the example panelist identifier 400, the example probability calculator 402, the example calibration calculator 404, and the example impression designator 406 of the impression corrector 210. The example processor 812 also implements the example assignment modeler 216. The processor 812 of the illustrated example is in communication with a main memory including a volatile memory 814 and a non-volatile memory 816 via a bus 818. The volatile memory 814 may be implemented by Synchronous Dynamic Random Access Memory (SDRAM), Dynamic Random Access Memory (DRAM), RAMBUS Dynamic Random Access Memory (RDRAM) and/or any other type of random access memory device. The non-volatile memory 816 may be implemented by flash memory and/or any other desired type of memory device. Access to the main memory 814, 816 is controlled by a memory controller.

The processor platform 800 of the illustrated example also includes an interface circuit 820. The interface circuit 820 may be implemented by any type of interface standard, such as an Ethernet interface, a universal serial bus (USB), and/or a PCI express interface.

In the illustrated example, one or more input devices 822 are connected to the interface circuit 820. The input device(s) 822 permit(s) a user to enter data and commands into the processor 812. The input device(s) can be implemented by, for example, an audio sensor, a microphone, a camera (still or video), a keyboard, a button, a mouse, a touchscreen, a track-pad, a trackball, isopoint and/or a voice recognition system.

One or more output devices 824 are also connected to the interface circuit 820 of the illustrated example. The output devices 824 can be implemented, for example, by display devices (e.g., a light emitting diode (LED), an organic light emitting diode (OLED), a liquid crystal display, a cathode ray tube display (CRT), a touchscreen, a tactile output device, a printer and/or speakers). The interface circuit 820 of the illustrated example, thus, typically includes a graphics driver card, a graphics driver chip or a graphics driver processor.

The interface circuit 820 of the illustrated example also includes a communication device such as a transmitter, a receiver, a transceiver, a modem and/or network interface card to facilitate exchange of data with external machines (e.g., computing devices of any kind) via a network 826 (e.g., an Ethernet connection, a digital subscriber line (DSL), a telephone line, coaxial cable, a cellular telephone system, etc.).

The processor platform 800 of the illustrated example also includes one or more mass storage devices 828 for storing software and/or data. Examples of such mass storage devices 828 include floppy disk drives, hard drive disks, compact disk drives, Blu-ray disk drives, RAID systems, and digital versatile disk (DVD) drives.

Coded instructions 832 to implement the example machine readable instructions of FIGS. 6 and/or 7 may be stored in the mass storage device 828, in the volatile memory 814, in the non-volatile memory 816, and/or on a removable tangible computer readable storage medium such as a CD or DVD.

From the foregoing, it will be appreciated that examples have been disclosed which allow accurate association of demographic data with impressions generated through exposure to media on a portable device without requiring individual members of a household to self-identify. In such an example, computer processing resources are conserved by not requiring the processor to execute an additional application used to facilitate self-identification. Advantageously, network communication bandwidth is conserved because an additional self-identification application does not need to be maintained (e.g., downloaded, updated, etc.) and/or does not need to communicate with the AME 108.

Additionally, it will be appreciated that examples have been disclosed which enhance the operations of a computer to improve the accuracy of impression-based data so that computers and processing systems therein can be relied upon to produce audience analysis information with higher accuracies. In some examples, computers operate more efficiently by relatively quickly correcting misattributions in EMM census data. In some examples, the corrected EMM census data is used to generate accurate misattribution correction factors (e.g., by calculating accurate device sharing probabilities, etc.). Such accurate misattribution correction factors are useful in subsequent processing for identifying exposure performances of different media so that media providers, advertisers, product manufacturers, and/or service providers can make more informed decisions on how to spend advertising dollars and/or media production and distribution dollars.

In some examples, using example processes disclosed herein, a computer can more efficiently and effectively determine misattribution error correction factors in impression data logged by the AME 108 and the database proprietors 104 a-b without using large amounts of network communication bandwidth (e.g., conserving network communication bandwidth). For example, the computer conserves processing resources because it does not need to continuously communicate with non-panelist individual online users (e.g., online users without an ongoing relationship with the AME 108) to request survey responses (e.g., probability surveys, etc.) about their online media access habits. In such an example, the AME 108 does not need to rely on such continuous survey responses from such online users. In some examples, survey responses from online users can be inaccurate due to inabilities or unwillingness of users to recollect online media accesses and/or survey responses can also be incomplete. By not requiring survey results from non-panelists, the processor resources required to identify and supplement incomplete and/or inaccurate survey responses are eliminated.

Although certain example methods, apparatus and articles of manufacture have been disclosed herein, the scope of coverage of this patent is not limited thereto. On the contrary, this patent covers all methods, apparatus and articles of manufacture fairly falling within the scope of the claims of this patent.

What is claimed is:
 1. An apparatus to correct media measurement data generated by a server, comprising: at least one memory; instructions; and at least one processor to execute the instructions to: generate electronic mobile measurement data based on network communications received from first client devices, the network communications corresponding to media accessed at the first client devices; select attributes associated with the electronic mobile measurement data to include in a model; generate the model based on the attributes and a first portion of the electronic mobile measurement data; determine a percentage of a second portion of the electronic mobile measurement data that the model correctly associates with corresponding first users of the first client devices; and when the percentage satisfies a threshold: determine: (a) when a second user operating a second client device is a primary user based on the model, and (b) when the user operating the second client device is one of a plurality of third users based on the model; and associate demographic information of the second user with the electronic mobile measurement data to reduce a misattribution error, the misattribution error generated by an impression server when generating the electronic mobile measurement data.
 2. The apparatus of claim 1, wherein the at least one processor is to execute the instructions to associate the demographic information of the second user operating the second client device with the electronic mobile measurement data to reduce processing resources on the second client device by identifying the second user operating the second client device without requiring the second user operating the second client device to self-identify.
 3. The apparatus of claim 1, wherein the attributes are first attributes and the model is a first model, and in response to the percentage not satisfying the threshold, the at least one processor is to execute the instructions to: select second attributes associated with the electronic mobile measurement data; generate a second assignment model based on the second attributes and the first portion of the electronic mobile measurement data; identify the second user operating the second client device as at least one of the primary user or one of the plurality of third users based on the second model; and associate the demographic information of the second user with the electronic mobile measurement data to reduce the misattribution error.
 4. The apparatus of claim 1, wherein the attributes include at least one of demographic information, household information, media information, or information related to the first client devices.
 5. The apparatus of claim 1, wherein the threshold is based on at least one of an error tolerance of a customer of an audience measurement entity or a known amount of error in the first portion of the electronic mobile measurement data.
 6. The apparatus of claim 1, wherein when a size of a household of the second user operating the second client device equals one, the at least one processor is to execute the instructions to identify the primary user as the second user operating the second client device.
 7. An apparatus, comprising: at least one memory; instructions; and at least one processor to execute the instructions to: log an impression based on a communication received from a client device, the logged impression corresponding to media accessed at the client device by a user operating the client device; determine that a primary user of the client device is not the user operating the client device when a first probability score calculated for the primary user does not satisfy a threshold, the first probability score indicative of a probability that the primary user is an audience member of the media accessed at the client device, the primary user being a household member of a household; in response to the determination that the primary user is not the user operating the client device: determine probability scores for corresponding ones of secondary users, the secondary users being household members residing in the household with the primary user, the probability scores indicative of probabilities that corresponding ones of the secondary users are the user operating the client device; and identify one of the secondary users as being the user operating the client device based on the one of the secondary users corresponding to a highest probability score of the probability scores; and reduce processing resource requirements of the client device by identifying the user operating the client device to associate demographic information of the user operating the client device with the logged impression without requiring the user operating the client device to self-identify.
 8. The apparatus of claim 7, wherein when a size of the household equals one, the at least one processor is to execute the instructions to identify the primary user as the user operating the client device.
 9. The apparatus of claim 7, wherein when the primary user has indicated that the primary user does not share the client device, the at least one processor is to execute the instructions to identify the primary user as the user of the client device.
 10. The apparatus of claim 7, wherein the threshold is a calibration factor divided by a size of the household.
 11. The apparatus of claim 10, wherein the calibration factor is based on second demographic information of the primary user and a type of the client device.
 12. The apparatus of claim 11, wherein the second demographic information of the primary user and the type of the client device correspond to a first demographic group, and the calibration factor is a ratio of an average time that the first demographic group accessed the media and an average time that all demographic groups accessed the media.
 13. At least one non-transitory computer readable medium comprising instructions that, when executed, cause at least one processor to at least: log an impression based on a communication received from a client device, the logged impression corresponding to media accessed at the client device by a user operating the client device; determine that a primary user of the client device is not the user operating the client device when a first probability score calculated for the primary user does not satisfy a threshold, the first probability score indicative of a probability that the primary user is an audience member of the media accessed at the client device, the primary user being a household member of a household; in response to the determination that the primary user is not the user operating the client device: determine probability scores for corresponding ones of secondary users, the secondary users being household members residing in the household with the primary user, the probability scores indicative of probabilities that corresponding ones of the secondary users are the user operating the client device; and identify one of the secondary users as being the user operating the client device based on the one of the secondary users corresponding to a highest probability score of the probability scores; and reduce processing resource requirements of the client device by identifying the user operating the client device to associate demographic information of the user operating the client device with the logged impression without requiring the user operating the client device to self-identify.
 14. The at least one non-transitory computer readable medium of claim 13, wherein when a size of the household equals one, the instructions, when executed, cause the at least one processor to identify the primary user as the user operating the client device.
 15. The at least one non-transitory computer readable medium of claim 13, wherein when the primary user has indicated that the primary user does not share the client device, the instructions, when executed, cause the at least one processor to identify the primary user as the user of the client device.
 16. The at least one non-transitory computer readable medium of claim 13, wherein the threshold is a calibration factor divided by a size of the household.
 17. The at least one non-transitory computer readable medium of claim 16, wherein the calibration factor is based on second demographic information of the primary user and a type of the client device.
 18. The at least one non-transitory computer readable medium of claim 17, wherein the second demographic information of the primary user and the type of the client device correspond to a first demographic group, and the calibration factor is a ratio of an average time that the first demographic group accessed the media and an average time that all demographic groups accessed the media.